
Application localisation: helping translation and development get along

Alexey Timin
May 5 · 19 min read

Hello! I’m Alexey Timin, a Senior Software Engineer at Magiclab, in charge of localisation system development. Let me tell you more about our localisation system.

We work on two products, Badoo and Bumble, and manage more than 150,000 phrases and texts, translated into 52 languages. Each project has its own users, its own market, its own style of communicating with users, and different versions for web and mobile platforms.

In this article I am going to describe our localisation, quality control and translation release processes and, most importantly, how we earned positive feedback on our translation system from our developers.

We have 300 developers working on projects. We have successfully managed to segregate responsibilities of translators and developers such that both groups can work independently and in parallel.

To start with, let’s have a look at what the localisation process looks like in our company.

Image for post

This diagram leaves out the minor details; you don’t need them to get an overall understanding.

We start with a Product Requirements Document (PRD). Then the client-side and server-side development teams start working on the feature while, in parallel, the translation process takes place.

The PRD and the final release stage are highlighted in the same colour; this means the result needs to match the PRD. If the PRD lacks some details, it won’t be clear to developers who is responsible for what. Is it the mobile developers who have to integrate a text into the client part? Or the server developers who need to return it from the server in response to a query?

Let’s get to the bottom of all this. Before we go any further, I would like to introduce and explain one particular term to you, a ‘lexeme’.

A lexeme is any unit of text which needs to be translated. This might be a text on a button, a header or an entire paragraph.

Now, we can move on to the main part.

PRD

Image for post

The table of lexemes specifies whether a text is returned by a server or integrated into the application. It is imperative that a key is given: if a text has been used before, then the key will be contained in the table; if, however, the text has not been used anywhere before, then a sequential number for the text will be given, and the developer will be able to set a suitable key.

Reusing text is a tricky business. On the one hand, it speeds up the localisation process, but on the other, you might find yourself in an odd situation.

Using an example will help me explain why. On one occasion, we had a question in our app (in English), “Do you smoke?”, to which the possible answers were “Yes” and “No”. Here we can see three lexemes: two for the answers and one for the question. The question was translated into Russian as “Do you smoke?” and the possible answers as “I smoke” and “I don’t smoke”. Then we decided to carry out another survey and to reuse the possible answers from the previous question. In English it all looked right: “Fancy going to a party?” with the answers “Yes”/“No”. In Russian, because we were reusing lexemes, the result was the following ‘exchange’: “Fancy going to a party?” answered by “I smoke”/“I don’t smoke”!

So, now, when we put together a PRD and are deciding on whether to reuse the text, we take account of the context in which it has been used before. We also specify whether a lexeme is returned by the server or integrated into the client and delivered to the clients via the App Store or Google Play.

These techniques save time because they obviate the need for discussion at later stages.

Translations

Let me detail for you what we start with, how we communicate context to our translators, maintain a common style and then check the final result.

Translation flow

An example. Let’s say we need to make a greeting less formal. Initially, in English we had “Hey”, in Spanish “Hola”, in French “Salut”, in Russian “Privet”, in Australian English “G’day mate” and in Mexican Spanish “Que onda” (“What’s the wave?”; Mexicans are cool!). Making text more original involves changing the English source text, at which point the translations into other languages become incorrect and have to be checked and tweaked. We always alert our translators to this issue.

The impact of context

Let me explain using some examples.

Note that some of the examples are screenshots of popular websites and applications. Their names don’t matter here; we are simply considering the most common types of localisation mistakes.

Image for post

This is a sign at a petrol station. The English translation says “gun” (the correct translation is “nozzle”). But to British and American readers, “gun” means a weapon. In this context, the phrase “Remove the gun from your fuel tank filler neck” sounds a bit strange.

In the next example, the creators of an application decided to create a universal version of a text both for men and women — apparently, there was some advantage in doing so.

Image for post

“Xотел(а)” is not actually a real word; it’s a combination of the male and female forms of the verb “to want”. Users have to choose the correct variation themselves. Compare it with the non-existent word “(wo)men”: it looks weird. We try to avoid translations like this.

The next example shows how the original sense of the text can get lost in translation. Look at the Russian on the right: we are being offered the opportunity of chatting with ourselves. But actually, the original meaning was an offer to link to our Instagram account.

Image for post

These sorts of mistakes occur when translations are carried out with no reference to context. That’s why we specify the following for each lexeme:

  • description (what is the lexeme about, where it’s going to be used, etc.)
  • an image which shows you elements that will appear next to the text on the screen
  • a note as to whether the text will be shown to male or female users — so that translators can work out whether they require different translations, or not
  • types of variable (this is a very important point — I will cover this in more detail when we come to look into the development process)
  • the maximum length of the text: this is very important for push-notifications because of the limited screen width on mobile devices.

Also, we always need to divide large texts into parts. This is handy if you need to do a search or make changes later on.

Let’s look into this point in more detail. When we divide up a text, we lose the connection between particular phrases and sentences. That’s why it is imperative that we show translators what comes before and after each part. This is relevant, for example, to legal documents, which must be translated correctly.

Also, translators need to be alerted to any regional terms or jargon words present in the lexemes. For example, take the sentence, “Unlock your Likes List to see everyone who’s interested at once”. The translator needs to know that “Likes”, here, refers to a special directory in the application which contains user contacts who have liked a profile. Another similar example would be the term “Stories”. Ten years ago, when someone heard the word “Stories” they wouldn’t have thought of Instagram. Nowadays, Instagram is the first thing people associate with that word.

So, a translation depends on context, namely on the following elements:

  • user gender
  • singular and plural in the text: “You have only one friend” and “You have ten friends”
  • platforms: Web, Android, iOS
  • the project for which the translation is being done.

Let’s stick with the final point for a moment. How lexemes are translated often depends on the project to which they belong. This is important because each project has its own distinct style.

For example, here are headers for letters sent to a user when their account has been blocked.

For Badoo: “Your account has been blocked.”
For Bumble: “You have been blocked.”

In order to retain a common style as part of each project, you need to give translators access to the translation history. We have a tool called Translation Memory (TM). The translator always has access to information on matching translations and the percentage of similarity: they can either use the old translation or enter a new one. We don’t only show translators texts which are 100% identical, but also ones that are less similar, and we always highlight the differences.

Image for post

Besides allowing style to be maintained within a given project, Translation Memory also helps speed up the whole translation process because translators don’t need to enter the same text a second time.
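
To give a feel for how a similarity percentage might be computed, here is a minimal sketch. It assumes a character-based comparison using Python’s standard `difflib` module; this is not necessarily how our Translation Memory works internally. The function name `find_tm_matches`, the 70% cut-off and the sample memory entries are all made up for illustration.

```python
from difflib import SequenceMatcher


def find_tm_matches(source_text, memory, min_similarity=0.7):
    """Return past translations whose source text resembles source_text.

    memory maps earlier source texts to their stored translations.
    """
    matches = []
    for past_source, past_translation in memory.items():
        ratio = SequenceMatcher(None, source_text, past_source).ratio()
        if ratio >= min_similarity:
            matches.append((round(ratio * 100), past_source, past_translation))
    # Best matches first, so an identical text (100%) tops the list.
    return sorted(matches, reverse=True)


# Illustrative memory content, not real project data.
memory = {
    "Your account has been blocked.": "Ваш аккаунт заблокирован.",
    "Join us!": "Присоединяйтесь!",
}
matches = find_tm_matches("Your account has been unblocked.", memory)
```

Here the near-identical “blocked” entry is returned with a high percentage while the unrelated one is filtered out, which is the behaviour a translator would expect from a TM suggestion list.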

Grammatical cases and numerals

Translators populate this matrix for various words in each language. Whenever a new word is required, it gets added to the table.

The matrix helps avoid incorrect plural form use:

Image for post

The Russian word “подписчика” means “subscribers”, but Russian has more plural forms than English, and the one used here is incorrect.

The advantage of this tool is that the required form isn’t chosen until immediately before it is shown to the user, at runtime. This is how it works:

Image for post

For example, let’s say we have a translation into Russian.

“Credits” in the middle is an identifier, a link to the case matrix.

“Credits amount” on the right is a number which comes from the developer.

@3 designates a grammatical case (here it’s the accusative case), which has been specified by the translator.

So the entire phrase will be shown in Russian using the relevant grammatical case automatically. Awesome!
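
The runtime selection can be sketched roughly as follows. This is an illustration only: the matrix layout, the case names used as keys and the function names are assumptions rather than our actual format. The plural rule itself is the standard one for Russian integers (categories “one”, “few” and “many”).

```python
def russian_plural_category(n):
    """Map a non-negative integer to its plural category for Russian."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


# Hypothetical case matrix for one word, filled in by a translator.
# Outer keys are grammatical cases, inner keys are plural categories.
CASE_MATRIX = {
    "credits": {
        "accusative": {"one": "кредит", "few": "кредита", "many": "кредитов"},
        "genitive": {"one": "кредита", "few": "кредитов", "many": "кредитов"},
    }
}


def render_counted_word(word_id, amount, case):
    """Choose the word form at runtime, once the actual number is known."""
    form = CASE_MATRIX[word_id][case][russian_plural_category(amount)]
    return f"{amount} {form}"
```

The translator only decides which case the phrase needs; the number arrives from the developer, and the lookup picks the form, so neither side has to know about the other’s part.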

Translation checking

We automatically check for omitted emojis or variables. If a translator has accidentally removed a variable, the phrase in question loses its structure and sense. Compare: “You have 10 credits” and “You have credits” — in the second case the phrase has been corrupted and the sense has been lost.

We also check for missing HTML, otherwise the layout will go awry.

And we also warn a translator if their translation is longer than the original text. At that point, the translator needs to check it for accuracy and whether it fits the screen width.
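
A minimal sketch of such checks might look like this. The `{name}` placeholder syntax, the function name and the warning texts are assumptions made for the example, and the emoji check is left out for brevity.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")  # assumed variable syntax: {credits_amount}
HTML_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def check_translation(source, translation):
    """Return a list of warnings for a translated lexeme."""
    warnings = []
    # Variables present in the source but dropped by the translator.
    missing_vars = set(PLACEHOLDER.findall(source)) - set(PLACEHOLDER.findall(translation))
    if missing_vars:
        warnings.append(f"missing variables: {sorted(missing_vars)}")
    # The same HTML tags must appear the same number of times.
    source_tags = sorted(tag.lower() for tag in HTML_TAG.findall(source))
    translated_tags = sorted(tag.lower() for tag in HTML_TAG.findall(translation))
    if source_tags != translated_tags:
        warnings.append("HTML tags differ from the original")
    # Only a warning: the translator decides whether the text still fits.
    if len(translation) > len(source):
        warnings.append("translation is longer than the original")
    return warnings
```

The length check deliberately produces a warning rather than an error, since a longer translation may still be accurate and fit the screen.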

Let’s highlight the main points of the translation process:

  • translators need to understand the context
  • the translation system needs to be sufficiently flexible so that a suitable translation can be made in every language and that a translator isn’t compelled to choose some universal wording. There has to be support around inflections and grammatical cases
  • there has to be automatic checking.

Help from users

A/B testing

We had two options to choose from: “Are you ready to meet new people? Join us!” and, “Just a few more steps… and you will be part of Badoo.” As a result of testing, we established that more users completed registration when they saw the second version of the push notification, so that’s the one we kept.

Below you will find a full list of the elements a translation depends on. As you can see, the fifth element is the A/B test: a user ending up in any given group means they were shown the relevant version of the text.

Image for post
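
One way to picture how a text is chosen from several variants is the sketch below. It assumes each variant carries a set of conditions (gender, platform, project, A/B group and so on) and that the most specific matching variant wins; the data layout and the name `select_variant` are hypothetical, not a description of our storage format.

```python
def select_variant(variants, context):
    """Pick the most specific variant whose conditions all match the context."""
    best_text, best_score = None, -1
    for conditions, text in variants:
        if all(context.get(key) == value for key, value in conditions.items()):
            if len(conditions) > best_score:
                best_text, best_score = text, len(conditions)
    return best_text


# A default text plus an alternative shown only to A/B test group "B".
variants = [
    ({}, "Are you ready to meet new people? Join us!"),
    ({"ab_group": "B"}, "Just a few more steps… and you will be part of Badoo."),
]
```

A user in group “B” gets the second text, while everyone else falls back to the default with no conditions attached.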

Collaborative translation platform (CTP)

What is the benefit of this approach and why is it important? When you don’t have a translator for a local dialect, let users do the work. As it turned out, they were really pleased to take part in the development of a project they liked.

We have a collaborative translation platform (CTP). You can access it using your Badoo account and vote for the best translation.

Image for post

This is a screenshot of a window inviting translation into German. Each user can add their version. As soon as one of the options reaches a threshold of votes, we show it to our in-house translator and they can use it as the main translation (on the condition that it complies with the style and rules for the project in question, isn’t offensive etc.).
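
The voting threshold can be sketched in a few lines. This assumes each vote is recorded as the text of the suggestion it supports; the function name and the threshold value are made up for the example.

```python
from collections import Counter


def suggestions_for_review(votes, threshold):
    """Return user suggestions with enough votes to show the in-house translator."""
    tally = Counter(votes)
    # most_common() lists the most popular suggestions first.
    return [text for text, count in tally.most_common() if count >= threshold]


votes = ["Hallo"] * 3 + ["Guten Tag"]
ready = suggestions_for_review(votes, threshold=3)
```

Reaching the threshold only puts a suggestion in front of the translator; the final decision on style and suitability stays with them.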

Don’t be afraid to ask users for help. They will put you right and offer their assistance.

Development

There are two main challenges here: how to organise development such that it runs in parallel; and how to keep track of errors when using lexemes and so ensure that the correct translations are displayed at the right time.

Development in parallel

Image for post

The old arrangement whereby we had to merge changes (different lexeme keys provided by the two developers)

Nowadays, we create and change lexemes centrally in the localisation system. Developers simply download a set of lexemes before they start working on a task. They write code, refer to the lexemes by their keys, and that’s it! They don’t need to think about anything else; translation-related questions are left to the translators.

Mistakes made in lexeme use

Image for post

For example, if you are in a hurry it is easy to confuse “credit_amount” and “credit”. In order to prevent such things from happening, we introduced a control mechanism, a so-called ‘text container’, to oversee the translation and identify the type of variables used in a particular translation. It performs substitutions and checks that values of expected types are sent for substitution. If all the substitutions are done, then the container returns a string only, which can be displayed to the user. If not, then the same sort of container is returned. If we attempt to display a translation before all the substitutions are done, then we get a warning in the logs and know where the problem is.
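
Here is a minimal sketch of the idea behind such a container. It is an illustration written for this article, not our production code: the class name, the `{name}` placeholder syntax and the choice to raise an exception on an unknown variable or a wrong type are all assumptions.

```python
import logging

logger = logging.getLogger("localisation")


class TextContainer:
    """Holds a translation until every variable has a value of the right type."""

    def __init__(self, key, template, var_types, values=None):
        self.key = key
        self.template = template    # e.g. "You have {credit_amount} credits"
        self.var_types = var_types  # e.g. {"credit_amount": int}
        self.values = dict(values or {})

    def substitute(self, **values):
        """Return a plain str once complete, otherwise another container."""
        for name, value in values.items():
            expected = self.var_types.get(name)
            if expected is None:
                raise KeyError(f"{self.key}: unknown variable {name!r}")
            if not isinstance(value, expected):
                raise TypeError(f"{self.key}: {name!r} expects {expected.__name__}")
        merged = {**self.values, **values}
        if set(merged) == set(self.var_types):
            return self.template.format(**merged)
        return TextContainer(self.key, self.template, self.var_types, merged)

    def __str__(self):
        # Reaching this point means the text is being displayed too early.
        missing = sorted(set(self.var_types) - set(self.values))
        logger.warning("lexeme %s displayed with unresolved variables %s", self.key, missing)
        return self.template
```

With this shape, passing `credit=10` where the lexeme expects `credit_amount` fails immediately, and printing a half-filled container leaves a warning in the logs naming the lexeme and the variables still missing.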

Main points regarding development:

  • developers shouldn’t have to think about localisation, changing text and such things
  • you need to check what developers do, and it is better if this checking is automated, as it spares the nerves of everyone involved in the process.

Quality control

Let’s start with some examples. How many mistakes can you find in this screenshot?

Image for post

I can find two:

  1. the long translation clearly doesn’t match the screen size. In this case, almost everything is just truncated and the caption doesn’t fit into the button.
  2. not all the lexemes are translated into Russian

In the following example, besides the text being displayed in various languages, we are also being offered the option to “experience difficulties”.

Image for post

Because “Узнать больше” (“More details”) has been truncated to “Узнать боль…”, it now means “Experience difficulties” instead 🤣

Remember: quality control is essential 👆

Quality Control options

Test version
The first thing which comes to mind is to check the translation on a test version of a website or application. That is to say, simply run it and see whether what comes out corresponds to the design, plan, technical brief and so on. Using this method we caught this mistake in a push notification. The message intended for a male user was sent to a test female user:

Image for post

Application screenshots

We developed a special tool which takes screenshots in the test environment of all the mobile application screens in all languages. Anyone in the company can see what the screens look like, via the browser. This also has a special mode, showing identifiers for all the displayed texts. This is very helpful when it comes to debugging; you can see quickly what lexeme it is and why it ended up where it did (e.g. perhaps we have reused the program code which uses the lexeme).

Suppose you have a web version and just need to collect screenshots of it featuring lexemes. You could integrate lexeme markers into the source code and write a plug-in for Google Chrome. Running in QA engineers’ browsers, the plug-in could send screenshots of pages where it finds lexemes to the localisation system.

<ul>
<li>...</li>
<li>
<!--lexeme_12345-->
Contacts
<!--lexeme_12345_end-->
</li>
<li>...</li>
</ul>
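
To show how markers like these could be picked out of a page, here is a small sketch of the matching logic. It is written in Python for brevity, whereas a real Chrome plug-in would do the equivalent in JavaScript; the function name `find_lexemes` is made up for the example.

```python
import re

# Matches <!--lexeme_ID--> ... <!--lexeme_ID_end--> with the same ID on both sides.
LEXEME_MARKER = re.compile(r"<!--lexeme_(\d+)-->(.*?)<!--lexeme_\1_end-->", re.DOTALL)


def find_lexemes(html):
    """Return {lexeme id: visible text} for every marked region in the page."""
    return {lexeme_id: text.strip() for lexeme_id, text in LEXEME_MARKER.findall(html)}


page = "<ul>\n<li>\n<!--lexeme_12345-->\nContacts\n<!--lexeme_12345_end-->\n</li>\n</ul>"
found = find_lexemes(page)
```

Knowing which lexeme identifiers appear on a page is what lets a screenshot be attached to the right lexemes in the localisation system.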

We used this method for quite some time. Within the first few weeks, it was already allowing us to collect a huge number of pictures. But we discontinued it because it only allowed us to obtain images of a version which had already been developed, while in the meantime we had learnt how to gather pictures before development was finished.

Quality control during the translation process

So, this is how the tool for carrying out quality control during the course of translation came into being.

Image for post

Let me explain the principle behind it. Our designers use Sketch, an application for creating interfaces, including ones for mobile applications. We have learnt to replace texts in Sketch files and, using the Sketch program interface, to generate screenshots of what we need on-screen. Now, as the translator is working on the text, we are able to show them screenshots in their language immediately. And we can do so even before developers start creating the first version of new functionality.

Later on, we released this solution as open source (article, code).

Translation audit

Main points regarding quality control

Release

Versioning lexemes

How? There is a branch of lexemes assigned to each branch of the task in Jira. When we incorporate any changes from any branch into a new version of the project, a new version of lexemes becomes available immediately. If we need to undo something, we simply remove a branch of the task from the new version and, along with it, a version of the lexemes with translations into all languages.

When a lexeme undergoes testing or when users can already see it, you need to be very careful; it is better to avoid making any changes to it and create a new version instead, assigning it to a ticket and, along with a new release, deploying a new version of the lexeme.

Versioning translations

Wrong: “It's a remath”.
Right: “It’s a rematch”.

In English, you shouldn’t use the straight apostrophe. Also, the letter “c” has been missed out.

Versioning lexemes and versioning translations are two different things. A translation can be corrected at any time: when a task is in development, when it is at the testing stage, or even when the functionality has already been delivered to users (there will be no harm done if the users see a correct translation in a new version of the application).

Deployment to different platforms

What you show a user comes either from the server or from what they already have on their smartphone (for example, an integrated translation).

Image for post

A translation passes from the server to the user via our production server, to which we can easily deliver updated versions of files with translations.

And an integrated translation has a long pathway; it passes via the App Store or Google Play. The user downloads an update and only having done so do they see the corrections. This process seemed too slow for us so we came up with our own updating mechanism, “Hot Update”. At the click of a button it allows us to generate a new version of translations and to let all the users around the world know there is something new to download and use.

Image for post

When an application is run on a mobile device, it sends a notification to the server that it has just launched and communicates the current translation version. If the localisation system has an update ready, it responds by sending an appropriate notification. The smartphone downloads the update and applies it.

The user will see new translations when they move on to the next screen. We have two articles written about implementing this solution: part one and part two.
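
The launch-time exchange described above boils down to a version comparison on the server. The sketch below assumes translation versions are plain increasing integers and that the response is a small dictionary; the function name, the field names and the URL are placeholders, not our real protocol.

```python
def handle_app_launch(client_version, latest_version, download_url):
    """Server side: decide what to tell an app that has just launched."""
    if client_version < latest_version:
        # The client is behind, so point it at the newer translation bundle.
        return {"update_available": True, "version": latest_version, "url": download_url}
    return {"update_available": False}


response = handle_app_launch(41, 42, "https://example.com/translations/42")
```

Because the client reports its own version, the server never has to track which device has which translations.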

Release: main points
During the release process, it is imperative to take the application’s pathway from you to your users into consideration. Different parts of your application probably update differently.

Final conclusions

Image for post

What you need to bear in mind when looking at implementing a translation system:

  • write a detailed PRD
  • take context into account and give translators access to it
  • keep a history of translations in order to be able to maintain a common style within a given project
  • automate quality control (otherwise, that translator, who might be several time zones away, might do everything their own way)
  • free developers from having to make decisions about tasks outside their core expertise. They are the ones who create new versions of your product, bringing joy to your users and giving you a feeling of satisfaction about the project you are creating.

Some additional materials I would like to share with you:

Bumble Tech

This is the Bumble tech team blog focused on technology and…

Alexey Timin

Written by

Bumble Tech

We’re the tech team behind social networking apps Bumble and Badoo. Our products help millions of people build meaningful connections around the world.
