Translating the software that powers Wikipedia
Strategies for success in free software localization
Amir Aharoni
Wikipedia is a website.
A website has content — the articles; and it has a user interface — the menus around the articles and the various screens that let editors edit the articles and communicate with each other.
Wikipedia is massively multilingual, so both the content and the user interface must be translated.
The easiest way to translate Wikipedia articles is to use Content Translation, and that’s a topic for another post. This post is about getting all of the user interface translated to your language, as quickly and efficiently as possible.
The translation of the software behind Wikipedia is done on a website called translatewiki.net. The most important piece of software that powers Wikipedia and its sister projects is called MediaWiki. As of today, there are 3,865 messages to translate in MediaWiki, and the number grows frequently. “Messages” in MediaWiki jargon are the pieces of text that are shown in the user interface and that can be translated. Wikipedia also has dozens of MediaWiki extensions installed, some of them very important: extensions for displaying citations and mathematical formulas, uploading files, receiving notifications, mobile browsing, different editing environments, and so on. There are around 4,700 messages to translate in the main extensions, and over 25,000 messages to translate if you want to have all the extensions translated. There are also the Wikipedia mobile apps and additional tools for making automated edits (bots) and monitoring vandalism, with several hundred messages each.
Translating all of it probably sounds like an enormous job, and yes, it takes time, but it’s doable.
In February 2011 or so — sorry, I don’t remember the exact date — I completed the translation into Hebrew of all of the messages that are needed for Wikipedia and projects related to it. All. The total, complete, no-excuses, premium Wikipedia experience, in Hebrew. I wasn’t the only one who did this, of course. There were plenty of other people who did this before I joined the effort, and plenty of others who helped along the way: Rotem Dan, Ofra Hod, Yaron Shahrabani, Rotem Liss, Or Shapiro, Shani Evenshtein, Inkbug (whose real name I don’t know), and many others. But back then in 2011 it was I who made a conscious effort to get to 100%. It took me quite a few weeks, but I made it.
The software that powers Wikipedia changes every single day. So the day after the translation statistics got to 100%, they went down to 99%, because new messages to translate were added. But there were just a few of them, and it took me a few minutes to translate them and get back to 100%.
I’ve been doing this almost every day since then, keeping Hebrew at 100%. Sometimes it slips because I am traveling or I am ill. It slipped for quite a few months because in late 2014 I became a father and didn’t have any time to dedicate to translation, and a lot of new messages happened to be added at the same time, but Hebrew is back at 100% now. And I keep doing this.
With the sincere hope that this will be useful for translating the software behind Wikipedia to your language, let me tell you how I do it.
Preparation
First, let’s do some work to set you up.
- Get a translatewiki.net user account if you haven’t already.
- Make sure you know your language code (a standard two- or three-letter abbreviation).
- Go to your preferences, to the Editing tab, and add languages that you know to Assistant languages. For example, if you speak one of the native languages of South America like Aymara (ay) or Quechua (qu), then you probably also know Spanish (es) or Portuguese (pt), and if you speak one of the languages of the former Soviet Union like Tatar (tt) or Azerbaijani (az), then you probably also know Russian (ru). When available, translations to these languages will be shown in addition to English.
- Familiarize yourself with the Support page and with the general localization guidelines for MediaWiki.
- Add yourself to the portal for your language. The page name is Portal:Xyz, where Xyz is your language code.
Priorities
The translatewiki.net website hosts many projects to translate beyond those related to Wikipedia. It hosts such respectable Free Software projects as OpenStreetMap, Etherpad, MathJax, Blockly, and others. Also, not all MediaWiki extensions are used on Wikimedia projects: plenty of extensions, with thousands of translatable messages, are used only on other sites, yet they rely on translatewiki.net as the platform for translating their user interface.
It would be nice to translate all of it, but because I don’t have time for that, I have to prioritize. On my translatewiki.net user page I have a list of direct links to the translation interface of the projects that are the most important.
I usually don’t work on translating other projects unless all of the above projects are 100% translated to Hebrew. I occasionally make an exception for OpenStreetMap or Etherpad, but only if there’s little to translate there and the untranslated MediaWiki-related projects are not very important.
Start with the MediaWiki most important messages group. If your language is not at 100% in this list, it absolutely must be. This list is automatically created periodically by counting which 500 or so messages are actually shown most frequently to Wikipedia users. This list includes messages from MediaWiki core and a bunch of extensions, so when you’re done with it, you’ll see that the statistics for several groups improved by themselves.
Next, if the translation of MediaWiki core to your language is not yet at 13%, get it there. Why 13%? Because that’s the threshold for exporting your language to the source code. This is essential for making it possible to use your language in your Wikipedia (or Incubator). It will be quite easy to find short and simple messages to translate (of course, you still have to do it carefully and correctly).
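To get a feel for what 13% means in practice, here is a minimal sketch in Python. It assumes that the threshold is applied to the total number of MediaWiki core messages and uses the figure of 3,865 quoted earlier in this post. The function name is made up for illustration, and the actual export rule on translatewiki.net may be computed differently.

```python
def messages_needed(total_messages: int, threshold_percent: int = 13) -> int:
    """Smallest number of translated messages that is at least
    threshold_percent of total_messages (assumed rule, see lead-in)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_messages * threshold_percent + 99) // 100


# 3,865 is the MediaWiki core message count mentioned earlier in this post.
print(messages_needed(3865))  # 503
```

Under these assumptions, roughly five hundred short messages are enough to cross the threshold, and the count will drift as new messages are added to core.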
Getting Things Done, One by One
Once you have the most important MediaWiki messages at 100% and at least 13% of MediaWiki core translated to your language, where do you go next?
I have surprising advice.
You need to get everything to 100% eventually. There are several ways to get there. Your mileage may vary, but I’m going to suggest the way that worked for me: Complete the easiest piece that will get your language closer to 100%! For me this is an easy way to strike an item off my list and feel that I accomplished something.
But still, there are so many items at which you could start looking! So here’s my selection of components that are more user-visible and less technical, sorted not by importance, but by the number of messages to translate:
- Cite: the extension that displays footnotes on Wikipedia
- Babel: the extension that displays boxes on userpages with information about the languages that the user knows
- Math: the extension that displays math formulas in articles
- Thanks: the extension for sending “thank you” messages to other editors
- Universal Language Selector: the extension that lets people select the language they need from a long list of languages (disclaimer: I am one of its developers)
- jquery.uls: an internal component of Universal Language Selector that has to be translated separately for technical reasons
- Wikibase Client: the part of Wikidata that appears on Wikipedia, mostly for handling interlanguage links
- VisualEditor: the extension that allows Wikipedia articles to be edited in a WYSIWYG style
- ProofreadPage: the extension that makes it easy to digitize PDF and DjVu files on Wikisource
- Wikibase Lib: additional messages for Wikidata
- Echo: the extension that shows notifications about messages and events (the red numbers at the top of Wikipedia)
- MobileFrontend: the extension that adapts MediaWiki to mobile phones
- WikiEditor: the toolbar for the classic wiki syntax editor
- ContentTranslation: the extension that helps translate articles between languages (disclaimer: I am one of its developers)
- Wikipedia Android mobile app
- Wikipedia iOS mobile app
- UploadWizard: the extension that helps people upload files to Wikimedia Commons comfortably
- Flow: the extension that is starting to make talk pages more comfortable to use
- Wikibase Repo: the extension that powers the Wikidata website
- Translate: the extension that powers translatewiki.net itself (disclaimer: I am one of its developers)
- MediaWiki core: the base MediaWiki software itself!
I put MediaWiki core last intentionally. It’s a very large message group, with over 3,000 messages. It’s hard to get it completed quickly, and to be honest, some of its features are not seen very frequently by users who aren’t site administrators or very advanced editors. By all means, do complete it, try to do it as early as possible, and get your friends to help you, but it’s OK if it takes some time.
Getting All Things Done
OK, so if you translate all the items above, you’ll make Wikipedia in your language mostly usable for most readers and editors.
But let’s go further.
Let’s go further not just for the sake of seeing pure 100% in the statistics everywhere. There’s more.
As I wrote above, the software changes every single day. So do the translatable messages. You need to get your language to 100% not just once; you need to keep doing it continuously.
Once you make the effort of getting to 100%, it will be much easier to keep it there. This means translating some things that are used rarely (but used nevertheless; otherwise they’d be removed). This means investing a few more days or weeks into translating-translating-translating.
Here’s the trick: Don’t congratulate yourself only upon the big accomplishment of getting everything to 100%, but also upon each accomplishment along the way.
One strategy to accomplish this is translating extension by extension. This means, going to your translatewiki.net language statistics: here’s an example with Albanian, but choose your own language. Click “expand” on MediaWiki, then again “expand” on “MediaWiki Extensions”, then on “Extensions used by Wikimedia” and finally, on “Extensions used by Wikimedia — Main”. Similarly to what I described above, find the smaller extensions first and translate them. Once you’re done with all the Main extensions, do all the extensions used by Wikimedia. (Going to all extensions, beyond Extensions used by Wikimedia, helps users of these extensions, but doesn’t help Wikipedia very much.) This strategy can work well if you have several people translating to your language, because it’s easy to divide work by topic.
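The "smaller extensions first" ordering can be sketched in a few lines of Python. This is only an illustration: the `Group` type and the `smallest_first` function are hypothetical names, and the message counts are invented, not taken from the real statistics page.

```python
from typing import NamedTuple


class Group(NamedTuple):
    name: str
    total: int       # messages in the group
    translated: int  # messages already translated to your language

    @property
    def remaining(self) -> int:
        return self.total - self.translated


def smallest_first(groups):
    """Return unfinished groups, fewest untranslated messages first."""
    unfinished = (g for g in groups if g.remaining > 0)
    return sorted(unfinished, key=lambda g: g.remaining)


# Hypothetical counts, invented purely for illustration.
groups = [
    Group("Cite", 60, 60),
    Group("Babel", 30, 10),
    Group("Math", 120, 115),
    Group("Echo", 400, 100),
]

for g in smallest_first(groups):
    print(f"{g.name}: {g.remaining} left")
```

Groups that are already complete drop out of the list, so each item you finish visibly shortens it.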
Another strategy is quiet and friendly competition with other languages. Open the statistics for Extensions Used by Wikimedia — Main and sort the table by the “Completion” column. Find your language. Now translate as many messages as needed to pass the language above you in the list. Then translate as many messages as needed to pass the next language above you in the list. Repeat until you get to 100%.
For example, here’s an excerpt from the statistics for today:
Let’s say that you are translating to Malay. You only need to translate eight messages to go up a notch (901 − 894 + 1 = 8). Then six more messages to go up another notch (894 − 888 = 6). And so on.
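The same arithmetic can be written as a small Python sketch. It assumes that the numbers in the example are counts of untranslated messages, so a language with a lower count ranks higher; the function name is made up for illustration.

```python
def steps_to_overtake(my_untranslated, others_untranslated):
    """Return how many messages to translate at each step in order to
    pass, one by one, every language that is currently ahead."""
    steps = []
    remaining = my_untranslated
    # Languages ahead have fewer untranslated messages; pass the closest first.
    ahead = sorted((n for n in others_untranslated if n <= remaining), reverse=True)
    for other in ahead:
        needed = remaining - other + 1  # one more than the gap, to end up strictly ahead
        if needed <= 0:
            continue  # already passed (for example, two languages were tied)
        steps.append(needed)
        remaining -= needed
    return steps


# Numbers from the Malay example above.
print(steps_to_overtake(901, [894, 888]))  # [8, 6]
```

After the first step you are at 893 untranslated messages, which is why the second step is six messages rather than seven.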
Once you’re done, you will have translated over 3,400 messages, but it’s much easier to do it in small steps.
Once you get to 100% in the main extensions, do the same with all the Extensions Used by Wikimedia. It’s over 10,000 messages, but the same strategies work.
Good Stuff to Do Along the Way
Never assume that the English message is perfect. Never. Do what you can to improve the English messages.
Developers are people just like you are. They may know their code very well, but they are not necessarily brilliant writers or designers. Some messages are written by professional user experience designers, but many are written by the developers themselves, and the English they write may not be perfect. Keep in mind that many, many MediaWiki developers are not native English speakers. Report problems with the English messages to the translatewiki Support page. (Use the opportunity to help other translators who are asking questions there, if you can.)
Another good thing is to do your best to try running the software that you are translating. If there are thousands of messages that are not translated to your language, then chances are that it’s already deployed in Wikipedia and you can try it. Actually trying to use it will help you translate it better.
Whenever relevant, fix the documentation displayed near the translation area. Strange as it may sound, it is possible that you understand the message better than the developer who wrote it!
Before translating a component, review the messages that were already translated. To do this, click the “All” tab at the top of the translation area. It’s useful for learning the current terminology, and you can also improve the existing translations and make them more consistent.
After you gain some experience, create a localization guide in your language. There are very few of them at the moment, and there should be more. Here’s the localization guide for French, for example. Create your own with the title “Localisation guidelines/xyz” where “xyz” is your language code.
As in Wikipedia, Be Bold.
Amir Aharoni is a contractor for the Wikimedia Foundation, improving MediaWiki’s support for different languages. He volunteers as a member of Wikimedia Israel and of the Language committee.
A longer version of this article originally appeared on Aharoni in Unicode, ya mama.