Wikipedia:Wikipedia Signpost/2012-04-02/Technology report

Technology report

Somewhere amongst the endless discussions about Gerrit lie details of hackathons, performance blips explained and more

Questions about Gerrit dominate developer discussions

Simplified versions of the development workflow change, illustrating how code review will (when the dust has settled on the switchover) fit into the new workflow and giving a sense of the new vocabulary involved

The change in the core version control system from Subversion to Git, insofar as it can be separated from the change in code review systems, seems to have settled in well after last week's switchover (Signpost coverage). By contrast, the new code review tool Gerrit continues to prove controversial, spawning dozens of threads on developer mailing lists.

The issues raised (many of which seem, at least on the surface, to be fairly minor) are too numerous, and in many cases too technical, to be adequately summarised in a couple of lines. In what is doubtless a positive sign, however, developers seem to be treating the vast majority of the problems encountered (such as an awkward system for responding to comments and the overly personal nature of the autogenerated taglines that accompany certain types of review) simply as issues – bugs needing to be fixed – rather than internalising them as complaints about the fundamentals of the new code review process. Indeed, work on a number of these issues has already started; others, however, will require changes to Gerrit itself. On the whole, developers seem hopeful that all their issues with the new code review process can be resolved, given enough time. Nevertheless, a handful of the issues raised do seem to have real sticking power, including concerns that Gerrit's code review paradigm may be fundamentally ill-suited to the review of large or complex changes (wikitech-l mailing list), too difficult for new contributors to come to grips with, or overly conducive to the kind of endless bar-raising that would see the gap between old and new contributors continue to widen.

Though the current trend suggests that issues will continue to be either resolved or ameliorated over the coming weeks, a potential future fly in the ointment is a planned audit of Gerrit's performance in three months' time. Such an audit, a pre-switchover concession to those who initially disliked Gerrit, has the potential to lead to the code review system being abandoned in favour of a competitor such as Phabricator. Needless to say, should grievances with Gerrit be unresolved by then – with or without great appetite for a second difficult migration – the audit could be a difficult one to manage.

Chennai hackathon

Write-ups of the Chennai Hackathon (held in the Indian city on March 17) began to be posted online this week, giving an insight into the success of a hackathon with a deliberately broad remit. Overall, thirteen projects were demonstrated at the end of the day, including a "text-a-quote" service, a hand-held device-based pronunciation recorder and work on an instant image rotate function accessible from file description pages (wikitech-l mailing list). The quality that WMF localisation team member Gerard Meijssen perceived in many of the projects prompted him to comment how they "deserve attention [from the wide] public—they represent missing functionality or they have a different approach to something we are struggling with. They are all by people who have a keen interest in the projects of the Wikimedia Foundation and as such they represent our 'latest generation'".

In total, the hackathon (one of an increasing number of tech-focused Wikimedia meetups being scheduled across the globe) attracted some 21 programmers, overwhelmingly but not exclusively male. In writing up the event, WMF developer and attendee Yuvi Panda described why he thought coders at the "super awesome and super productive" event were able to get so much done in a single eight-hour day.

In brief

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

  • Google Summer of Code proposal deadline approaches: Time is running short for students wishing to vie for one of the small number of Google Summer of Code (GSoC) places available at the Wikimedia Foundation this summer. Students, of whom eight were accepted last year, have until Friday to make proposals for development projects that they would like to work on over the summer months in return for a stipend provided by Google and worth up to US$5000. This year's WMF-GSoC coordinator, Sumana Harihareswara, described the selection criteria that would be used to whittle down the 20 or more applicants (of whom your author is one) as "quantity over quality... [and] promising continuing contributors... [over] specifics of their proposals" (MediaWiki.org). Those only just realising that the programme might be for them are advised to make contact immediately; the list of accepted proposals will be made public by April 23.
  • Wikidata scheme launched: The Wikimedia movement's first new project in six years was launched this week by Wikimedia Deutschland with funding from prominent donors. Wikidata, a machine-readable central repository of data accessible to all, is the culmination of years of discussions and plans on how best to co-ordinate updated statistical information and metadata across multiple Wikimedia projects. To learn more on this "Commons for data", see this week's "News and notes" report.
  • Performance problems: Performance issues for certain users (beginning on March 21) were traced this week to a network connection problem at EQIAD, Wikimedia's Virginia-based datacentre (wikitech-l mailing list), allowing them to be resolved with immediate effect on March 24. This did not stop separate problems developing on March 29, however, which had to be resolved by power cycling a deficient server (also wikitech-l). In unrelated news, search functionality was broken for some time on March 31; the problem has since been traced to a single host issuing dozens of search queries simultaneously. It is not known whether they did so deliberately or accidentally.
  • Toolserver lag to end soon... possibly...: Over the last week, Toolserver lag has continued to cause problems for visitors. Toolserver admin DaB has, however, disclosed on VPT that it is hoped the slower-than-expected restore of the database will conclude sometime on April 3. The Toolserver is normally able to catch up at about 2–3 times the rate of generation, so for each day the backup is old, catch-up will take eight to twelve hours. Based on these estimates, once the database restore completes, it will be less than a week before the lag is resolved and normal Toolserver service returns.
  • Berlin hackathon registration now open: Registration is now open for the fourth annual Berlin hackathon, being held in the German capital on June 1–3 (wikitech-l mailing list). Over 100 guests are expected for the "premier event for the MediaWiki and Wikimedia technical community", drawn from the "user scripts, gadgets, API use, Toolserver, Wikimedia Labs, mobile, structured data [and] template" communities. The hackathon last year brought together some 96 attendees from four continents; no closing date for registrations has yet been announced.
  • Non-WMF-deployed extensions to begin migration to Git: The first non-WMF-deployed extensions sitting in the WMF Subversion repository will be migrated to Git on April 6 (MediaWiki.org), the permission of their principal maintainer(s) notwithstanding. Extension maintainers will then be able to choose between pre- and post-review models for their extensions, and will be given control over who is allowed to review code submitted to them. In the case of over 500 extensions, it is yet to be decided whether or not they will follow in the footsteps of their WMF-deployed counterparts, which were migrated on Git day itself; eventually, some will be converted, others will "move out" and continue to use Subversion, while other (mostly abandoned) extensions will be left in the WMF Subversion repository, frozen in read-only mode.
  • Five bots approved: Five BRfAs were recently approved:
    1. SD5bot's 2nd BRfA, replacing {{ndash}} with {{spaced ndash}}
    2. PALZ9000, updating orbital data on space stations and other satellites
    3. Thehelpfulbot's 12th BRfA, tagging templates that are up for discussion at TfD when requested, such as for this task on Bot requests
    4. MadmanBot's 14th BRfA, changing the host of (dead) external links from www.zerozerofootball.com to www.footballzz.co.uk, and replacing links to http://www.zerozerofootball.com/jogador.php with {{Zerozero profile}} to accommodate any future changes in domain name
    5. Helpful Pixie Bot's 49th BRfA, linking "Expand language" tags to their correctly interwikied counterpart
At the time of writing, 14 BRfAs are active. As always, community input is encouraged.
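
For readers who want to check the Toolserver catch-up estimate quoted in the lag item above, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the function name `catch_up_hours` and its parameters are hypothetical, and the simple model (lag divided by replay speed) is an assumption that happens to reproduce the quoted figures of eight to twelve hours per day of lag. The optional stricter mode, which also counts changes that arrive while catch-up is in progress, is not taken from the report.

```python
def catch_up_hours(lag_days, replay_factor, include_new_changes=False):
    """Estimate the hours needed to clear `lag_days` of replication lag.

    `replay_factor` is how many times faster than real time the backlog is
    replayed (the report quotes roughly 2 to 3).

    Simple model (default): hours = lag / replay_factor.
    Stricter model (an assumption, not from the report): new changes keep
    arriving at real-time rate during catch-up, so the backlog only shrinks
    at (replay_factor - 1) times real time.
    """
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    effective_rate = replay_factor - 1 if include_new_changes else replay_factor
    if effective_rate <= 0:
        raise ValueError("replay is too slow ever to catch up")
    return lag_days * 24 / effective_rate


# One day of lag at 2x and 3x replay speed, simple model.
print(catch_up_hours(1, 2), catch_up_hours(1, 3))  # 12.0 8.0
```

Under the stricter model the same one day of lag would take 24 and 12 hours respectively, so the quoted figures are best read as a lower bound.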