Localizing your React app (the right way)

David Nääs
IDAGIO
Dec 5, 2018

Background

We’ve previously covered how we at IDAGIO went about implementing localization in our frontend. If you haven’t read that article yet, I recommend doing so first. It describes the underlying technologies we use and the basic steps to a localized and internationalized app. You’ll also get an understanding of the corners we cut and why we cut them.

This initial approach served us well for a long time and allowed us to target both an English- and a German-speaking audience. The implementation did, however, have limitations that we were well aware of: targeting further markets, and by extension supporting more than two locales, would require some extra work in the l10n and i18n department. In this article, I will outline how this work unfolded and what it resulted in.

We found ourselves in a situation where we wanted to prepare for a wider audience and an unknown number of new languages. Since we liked parts of the approach we had already established, we set out to improve it and make it as flexible and efficient as possible. Despite its merits, what was already in place had some clear pain points:

  • Everything had to be kept in sync manually. This included strings in the source code, the JSON files containing the translations, and POEditor (the online service where non-developers can add translations). This was an error-prone process with multiple pitfalls, causing diverging file states, overwritten translations and so on.
  • Missing translations stopped features from being merged. Often the last step for a developer working on a feature was to beg someone to translate any affected strings. Translating something also tied up development resources, since someone needed to edit files within the source code repository.
  • There was no formalized process for how to do things. What should be changed first? At what point should the locale files be uploaded to POEditor? If values conflicted, which one was correct?

Principles of 2.0

To remedy the weaknesses of the old system we decided on a few key points that the new version should stick to.

  1. Single source of truth. There should be no confusion about which underlying English string is the correct one. If inconsistencies appear, it should be clear what to use as a reference.
  2. Arbitrary number of languages. There should be no upper bound to how many languages we are able to support. Adding a new language should require minimal effort from developers.
  3. Decoupled. Adding translations and developing features should be kept as independent as possible.
  4. Automated. Automate as much as possible. This makes the process both more efficient and less prone to errors.

Implementing the principles

In our previous article we touched upon how babel-plugin-react-intl can be used and how we, regrettably, weren’t aware of this package at first. The way we were using react-intl was not fully compatible with the accompanying babel plugin which kept us from fully taking advantage of it.

Why does this kind of usage cause trouble? When it’s impossible for the static analyser to compute that default value, it can’t correlate one translation (or a group of translations) to a usage point.
- Somewhat remorseful IDAGIO developer

But I can now happily declare that it is the backbone of our new localization pipeline!

Extract script

What said plugin allows you to do is extract all of the messages from your source code at compile time. By default it creates a new JSON file for every JavaScript file that contains one or more messages, placed in a structure mirroring your source directory. This was not very useful to us, so we ended up using it slightly differently: we put it in a separate script, unrelated to our build process, that traverses all of our source files, performs the extraction and, most importantly, accumulates the messages into one single JSON object, which it then writes to a file.

This allowed us to make our source code the single source of truth for messages. The JSON file generated by our script is a 1:1 representation of the react-intl messages defined in our source code. In mathematical terms, the extract script is a bijective function: every react-intl message maps to a distinct entry in the JSON file and vice versa (as long as identical messages in the source are regarded as the same member of the set of declared messages).
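The accumulation step can be sketched in a few lines. This is a simplified illustration, not our actual script: file traversal and the babel transform are omitted, and `extractedPerFile` (with hypothetical file names and message ids) stands in for the per-file arrays of `{ id, defaultMessage }` descriptors that babel-plugin-react-intl produces.

```javascript
// Stand-in for what babel-plugin-react-intl would emit per source file.
const extractedPerFile = {
  'src/components/Header.js': [
    { id: 'header.title', defaultMessage: 'Discover classical music' },
  ],
  'src/components/Footer.js': [
    { id: 'footer.imprint', defaultMessage: 'Imprint' },
    // Duplicate id with an identical defaultMessage is allowed (rule 3 below).
    { id: 'header.title', defaultMessage: 'Discover classical music' },
  ],
};

// Accumulate every file's messages into one { id: defaultMessage } object.
function accumulateMessages(perFile) {
  const messages = {};
  for (const file of Object.keys(perFile)) {
    for (const { id, defaultMessage } of perFile[file]) {
      messages[id] = defaultMessage;
    }
  }
  // Sort the keys so the generated file is stable across runs and diffs cleanly.
  return Object.fromEntries(
    Object.entries(messages).sort(([a], [b]) => a.localeCompare(b))
  );
}

// The result would then be written to e.g. the default locale file.
const result = accumulateMessages(extractedPerFile);
```

Writing the sorted object to disk is then a plain `fs.writeFileSync` with `JSON.stringify(result, null, 2)`.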

In order for this to work, certain rules have to be adhered to when developing:

  1. IDs must be declared statically (this was covered in detail in our previous article).
  2. Always use defineMessages when using the imperative formatMessage API. This is the only way to signal that a string is a react-intl message that can be extracted.
  3. Message IDs declared more than once must have identical defaultMessages. Anything else breaks the function’s bijective property and ultimately causes ambiguity.
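In practice, rule-compliant declarations look something like the sketch below. At runtime react-intl’s defineMessages simply returns its argument; it exists so the babel plugin can find messages statically. It is stubbed here to keep the example self-contained, and the ids are made up for illustration.

```javascript
// Stub: react-intl's defineMessages is an identity function at runtime;
// its real purpose is to mark messages for static extraction.
const defineMessages = (messages) => messages;

// Good: ids and defaultMessages are static literals the plugin can read.
const messages = defineMessages({
  save: { id: 'settings.save', defaultMessage: 'Save changes' },
  cancel: { id: 'settings.cancel', defaultMessage: 'Cancel' },
});

// Bad: a computed id is invisible to static analysis (violates rule 1),
// and a bare descriptor never passed through defineMessages cannot be
// extracted at all (violates rule 2):
// formatMessage({ id: `settings.${action}` });
```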

Linter

To enforce these rules and help developers, we also developed a small lint script. The heavy lifting is again done by babel-plugin-react-intl, as it will itself error in a lot of cases; for instance, it errors if you violate point 3. But since we extract messages from each file separately, and not, as the plugin expects, from a concatenated bundle, this check does not work across files. We therefore added a custom check that enforces the rule across the whole source tree.
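The cross-file part of that check boils down to remembering which file first declared each id. The following is a minimal sketch of the idea, not our actual linter; the per-file input format mirrors the `{ id, defaultMessage }` descriptors from the extraction step, and the file names are invented.

```javascript
// Flag any id that appears with two different defaultMessages (rule 3),
// even when the occurrences live in different files.
function findConflicts(perFile) {
  const seen = new Map(); // id -> { defaultMessage, file }
  const conflicts = [];
  for (const [file, msgs] of Object.entries(perFile)) {
    for (const { id, defaultMessage } of msgs) {
      const prev = seen.get(id);
      if (prev && prev.defaultMessage !== defaultMessage) {
        conflicts.push({ id, files: [prev.file, file] });
      } else if (!prev) {
        seen.set(id, { defaultMessage, file });
      }
    }
  }
  return conflicts;
}

const perFile = {
  'src/PlayButton.js': [{ id: 'cta.play', defaultMessage: 'Play' }],
  'src/Banner.js': [{ id: 'cta.play', defaultMessage: 'Play now' }],
};
console.log(findConflicts(perFile));
// → [ { id: 'cta.play', files: [ 'src/PlayButton.js', 'src/Banner.js' ] } ]
```

A lint script would print these conflicts and exit non-zero so CI fails.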

Unfortunately, there is currently no way to enforce point 2 above. So far we haven’t found this to be a problem, as we try to use react-intl’s imperative API as little as possible. Possibly this could be checked at run time in development by tagging messages passed through defineMessages and then checking for this tag in formatMessage.
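That run-time idea could look roughly like this. To be clear, this is speculative and not something react-intl does; both functions are stubs standing in for wrappers around the real library.

```javascript
// Hypothetical dev-time check: tag descriptors as they pass through
// defineMessages, then warn in formatMessage when the tag is missing.
const TAG = Symbol('declared-via-defineMessages');

const defineMessages = (messages) => {
  for (const key of Object.keys(messages)) {
    messages[key][TAG] = true;
  }
  return messages;
};

function formatMessage(descriptor) {
  if (process.env.NODE_ENV !== 'production' && !descriptor[TAG]) {
    console.warn(
      `Message "${descriptor.id}" was not declared via defineMessages ` +
      'and will not be extracted'
    );
  }
  return descriptor.defaultMessage; // real formatting omitted
}
```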

Communication with translation services

In our case we use POEditor as a means of storing and editing translations, but in reality this could be any comparable service. For us, its role is simply to provide a platform where non-developers can translate English into other languages. We interact with POEditor in two ways: we push the extracted messages from our source code, and we pull new translations into our source. In the same way that our source code acts as the source of truth for the default copy, POEditor acts as the source of truth for translations. This is also where automation comes into the picture: we wanted to avoid doing manual uploads and downloads through a browser and offload as much as possible to CI.
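The push side mostly amounts to reshaping the extracted messages into whatever the service expects. As a sketch, assuming POEditor accepts a JSON array of `{ term, definition }` objects (check its API documentation for the exact shape your project needs), the conversion is a one-liner; the HTTP upload and authentication are omitted here.

```javascript
// Turn the single { id: defaultMessage } object from the extract script
// into the term list a translation service can import.
function toPoeditorTerms(messages) {
  return Object.entries(messages).map(([id, defaultMessage]) => ({
    term: id,
    definition: defaultMessage,
  }));
}

const terms = toPoeditorTerms({
  'header.title': 'Discover classical music',
  'footer.imprint': 'Imprint',
});
// `terms` would be posted to the service's upload endpoint.
```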

CI and automation

We knew we wanted to automate as much as possible but a few questions had to be explored. What was possible to automate? Were some steps more suitable as manual actions? When and how should automated tasks be triggered?


In the end we managed to automate almost everything; only one task (pulling translations) remained a manual command. The automation happens in two different CI jobs that are both triggered when something changes on our development branch. We’ve always had a job named tests that, unsurprisingly, runs our tests but also checks that our linters pass. It was therefore natural to add the new linter to this job. The rest of the automation takes place in a new job called intl. It uses our extraction script in conjunction with the GitHub and POEditor APIs. After the extraction has happened, it uploads the new file to POEditor and then opens a new pull request containing the changes. This achieves two things: translators can start working on the new messages, and our default locale file will be in sync with the source code (once the PR is merged). It is also worth noting that this job does nothing if the change that triggered it doesn’t contain any new messages.

In the diagram below you can follow the high level flow of the resulting automation. Follow the numbered steps to see the sequence of events of a typical use case.

Schematic outline of CI job.

The manual step mentioned earlier consists of locally triggering a script very similar to the intl CI job. It pulls all translation files (i.e. everything except English) from POEditor, commits them and opens a new pull request. This can be done at any time and, likewise, does nothing if no new translations were detected on POEditor. The reason for keeping this a manual step is that we were unable to find a suitable condition to trigger it on. For instance, it is not very practical to run it on every new translation: that would yield one pull request per translation. It would be nice if translators had a way to “commit” their changes so that translations could be batched into one pull request. But for now we’re sticking with a manual trigger, which has turned out to be not too bad after all.
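The core of such a pull script, once the fetching is done, is writing one JSON file per locale. A minimal sketch under stated assumptions: `pulled` stands in for the API response (its shape is illustrative), empty strings mark untranslated entries and are dropped so react-intl can fall back to the default message, and the output paths are hypothetical.

```javascript
// Turn fetched translations into per-locale file contents,
// dropping untranslated (empty) entries and sorting keys for stable diffs.
function toLocaleFiles(pulled) {
  const files = {};
  for (const [locale, translations] of Object.entries(pulled)) {
    const entries = Object.entries(translations)
      .filter(([, value]) => value !== '')
      .sort(([a], [b]) => a.localeCompare(b));
    files[`locales/${locale}.json`] = Object.fromEntries(entries);
  }
  return files;
}

// Each entry of the result would be written to disk, committed,
// and included in the pull request the script opens.
```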


Conclusion

We’ve been using this workflow for quite some time now and haven’t run into any major problems. There is certainly room for improvement but, overall, it has made our lives easier and our code more robust.

It has to be said that it does add a bit of overhead, since you have to stick to a set of rules when doing something as trivial as adding copy to your app. But once you get used to this, you will experience a sense of relief in other areas: you can now fully focus on building that new feature you’re working on and leave the translations for later. And best of all, you’ll never have to manually sync those JSON files again!

There have been hiccups related to the automation, but since our flow is to open pull requests rather than push directly to develop, these have been easy to detect. There is always a manual step for applying changes made by CI, and even these PRs have to pass review!

NPM package

Should anyone be interested, we’re releasing the extract script and the linter as an npm package. It wraps babel-plugin-react-intl and exposes utility functions that can be used to set up your own workflow. The wrapped babel plugin does not need to be installed in your project; you just use the exposed functions together with your current babel setup.

The push and pull scripts and the CI config are a bit too specific to our setup to be useful to most people. Our automation flow is by no means a silver bullet and the npm package doesn’t necessarily have to be used the way we use it. Manually triggering the extract script still saves you time and effort!

Future work

There are still things that can be improved and things we want to correct in the future:

  • Automating the fetching of translations somehow. Or at least notifying devs when there are new translations available.
  • Making the linter detect missing usages of defineMessages when using the imperative API.
  • The automation process sometimes opens duplicate PRs. It’s a minor problem, but we still don’t know the cause.
  • Adding contextual descriptions to messages. This would make it easier for translators that are less familiar with the product. This is supported by react-intl.

Thanks to Vlad Goran for help and support!
