Localisation or: How I learned to stop worrying and love babel-plugin-react-intl

Published in

IDAGIO

8 min read · Dec 16, 2017

Internationalisation and localisation, or i18n, is dully defined as the process by which support for new languages is introduced into a system. We at IDAGIO run a classical music streaming service. We wanted to address an international audience, so we started out with English support (British English; en-GB to be exact).

But we soon wanted the ability to cater to a wider, non-English-speaking audience as well. We are based in Berlin, Germany, so naturally our choice for a second language was German. We released our platform in the de-DE locale in early 2017.

A big part of i18n is figuring out its second part, the locales. Each locale has its own particularities: where to put a currency symbol, how to pluralise, or how to format a date. These are all things that you, as a systems developer, don’t want to figure out on your own. You’re much better off relying on something that is created and maintained by people who have this as their main focus.

In the frontend JavaScript space the de facto solution is FormatJS, created by the Yahoo Presentation Team. It is built on top of the ECMAScript Internationalization API, and taken together they are an amazing attempt at a standardised solution to these problems. As defined by Yahoo:

FormatJS is a modular collection of JavaScript libraries for internationalization that are focused on formatting numbers, dates, and strings for displaying to people. It includes a set of core libraries that build on the JavaScript Intl built-ins and industry-wide i18n standards, plus a set of integrations for common template and component libraries.

At IDAGIO, when we started building our web frontend, we chose React. The solution FormatJS proposes for this ecosystem is the yahoo/react-intl library.

Once we found the framework we felt that the bulk of the work was done. The rest consisted of localising our applications’ strings.

Now, when you want to internationalise a string you use components. react-intl provides these, for instance FormattedMessage, which takes an id and a default value as props. The id is looked up in a translation JSON file. This file contains a JSON object with key-value pairs: the key is the id of the string and the value is its translation. You will have one of these files for every language you support.
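As a rough illustration of the lookup described above (not react-intl’s actual implementation), the sketch below resolves a message by id from a per-locale object and falls back to the default value when the id is missing. The translations object, its example strings and the resolveMessage helper are all made up for this example.

```javascript
// Hypothetical translation files, one object per supported locale.
// Keys are message ids; values are the translated strings.
const translations = {
  'en-GB': { 'player.play': 'Play', 'player.pause': 'Pause' },
  'de-DE': { 'player.play': 'Abspielen' },
};

// Simplified stand-in for what a component like FormattedMessage does:
// look up the id in the active locale, otherwise use the default value.
function resolveMessage(locale, { id, defaultMessage }) {
  const messages = translations[locale] || {};
  return Object.prototype.hasOwnProperty.call(messages, id)
    ? messages[id]
    : defaultMessage;
}

// Translated id: the German string is used.
console.log(resolveMessage('de-DE', { id: 'player.play', defaultMessage: 'Play' }));
// Untranslated id: the default value is used instead.
console.log(resolveMessage('de-DE', { id: 'player.pause', defaultMessage: 'Pause' }));
```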

What we did was take all of the static strings that needed localising in our app and put them in an en-GB.json file. Once complete, we modified our app to use this JSON file. By breaking out our localisable content in this way we were in a position to build in any other localisation files we might need in the future.

Growing pains

To begin with, when we needed a translation for a word we would ask on our #deutsch Slack channel. We would then update the German translation file and that would be it. This held steady and very soon we were close to 500 translation strings in our localisation files.

This was not a good process. The lack of ownership resulted in a lack of consistency in tone and message. Plus, this would not scale if we added more languages.

When we hired a German-speaking communications manager we found it to be a good opportunity for formalising this process. To do so we introduced POEditor into our workflow. It’s a place to centralise all our localisation files across all platforms. Within it users can modify the translations freely, attribution and versioning are available, and when that’s done the devs can download the respective JSON files.

Before POEditor these translation files were living inside our GitHub repo, which made them less accessible for the non-developers. But now we had a process and specifically someone in charge of overseeing the translation strings. This person was now able to take a long look at our translation files and asked:

Wait… Is key.that.is.obviously.not.used still used?
- IDAGIO Communications Manager

It turns out we’d gotten sloppy: sometimes we had forgotten to remove obsolete translation strings, sometimes the fallback (English) values would not match the values defined in the English version AND sometimes we would forget the localisation string altogether.

BAD DEVELOPERS, BAD!

To err is human. We must look to the great thinkers for direction:

There’s an old saying in Tennessee — I know it’s in Texas, probably in Tennessee — that says, fool me once, shame on — shame on you.
Fool me — you can’t get fooled again.
- George W. Bush

Fool me twice

These types of problems, where something is either missing or not defined, are not something we face very often in our code. The reason is that we have automated tooling that ensures that we define and use exactly the variables in our system.

It’s called linting; and all the cool kids are doing it. For our code base we are running ESLint and stylelint with pretty aggressive settings and are really happy about it. We also have a no-dead-code policy that gets enforced by our linters.

Clearly we needed something similar for linting the localisation files. This tool should look at the English JSON (the source of truth) and compare it with usages in the system. Furthermore, if it finds inconsistencies we should be notified and the linting process should error out. Do not pass go; don’t collect green badge. 🚨
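A minimal sketch of what such a check could look like, assuming the usages have already been collected as a list of { id, defaultMessage } objects. The names compareMessages and exitCodeFor are made up for illustration; this is a sketch of the idea rather than our actual script.

```javascript
// english: the source-of-truth object, e.g. the parsed en-GB.json
// usages:  [{ id, defaultMessage }] collected from the source code (assumed input)
function compareMessages(english, usages) {
  const usedIds = new Set(usages.map((usage) => usage.id));
  const has = (id) => Object.prototype.hasOwnProperty.call(english, id);
  return {
    // used in the code but absent from the English file
    missing: usages.filter((u) => !has(u.id)).map((u) => u.id),
    // fallback in the code differs from the value in the English file
    mismatched: usages
      .filter((u) => has(u.id) && english[u.id] !== u.defaultMessage)
      .map((u) => u.id),
    // present in the English file but never used in the code
    unused: Object.keys(english).filter((id) => !usedIds.has(id)),
  };
}

// 0 for good, 1 for bad: any inconsistency should fail the lint run.
function exitCodeFor(report) {
  const problems =
    report.missing.length + report.mismatched.length + report.unused.length;
  return problems === 0 ? 0 : 1;
}

const exampleReport = compareMessages(
  { 'a.title': 'Title', 'a.old': 'Old' },
  [
    { id: 'a.title', defaultMessage: 'Titel' },
    { id: 'a.new', defaultMessage: 'New' },
  ]
);
console.log(exampleReport, exitCodeFor(exampleReport));
```

In a Node lint script the result of exitCodeFor could be assigned to process.exitCode so that the CI run fails when anything is out of sync.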

Prior art: babel-plugin-react-intl

There exists another approach that circumvents this problem altogether: yahoo/babel-plugin-react-intl.

The idea behind this project is that you would generate the fallback localisation file (en-GB.json for us) out of your source files; effectively inverting the control back to the source files. This is a great idea as it would relieve the need for all this linting business.
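To make the inversion concrete: as far as I know, the plugin emits the extracted messages as lists of descriptors carrying an id and a defaultMessage. Assuming that shape, building the fallback file is a small fold over the list. buildLocaleFile is a hypothetical helper, and the conflict check is an extra of this sketch, not something the plugin is claimed to do.

```javascript
// descriptors: [{ id, defaultMessage }] as extracted from the source files
// (assumed shape for this sketch)
function buildLocaleFile(descriptors) {
  const locale = {};
  for (const { id, defaultMessage } of descriptors) {
    const seen = Object.prototype.hasOwnProperty.call(locale, id);
    // The same id with two different defaults would make the file ambiguous.
    if (seen && locale[id] !== defaultMessage) {
      throw new Error(`Conflicting default messages for "${id}"`);
    }
    locale[id] = defaultMessage;
  }
  return locale;
}

const enGB = buildLocaleFile([
  { id: 'profile.functions.composer', defaultMessage: 'Composer' },
  { id: 'profile.functions.soloist', defaultMessage: 'Soloist' },
]);
// Prints the id-to-default-message map that would be written to en-GB.json.
console.log(JSON.stringify(enGB, null, 2));
```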

But the problem was that when we tried it we realised that we had managed to sneak some unsupported usages into the mix. Because of these, the tool couldn’t reliably identify which strings are used.

Unsupported usages

Like I said, localising strings is all about ids and values. There might be situations where the id might not be known at compile time. For example the id might be something coming from an external source, such as an API. As a specific example we can observe the functions of a profile. We currently do it like this:

(fnName) =>
  <FormattedMessage
    id={`profile.functions.${fnName}`}
    defaultMessage={capitalize(fnName)}/>

Our en-GB file contains the ids of all the profile functions:

"profile.functions.composer": "composer",
"profile.functions.soloist": "soloist",
"profile.functions.conductor": "conductor",
...

The render function above does two things:

It matches against an id, for example profile.functions.soloist, and if it is found uses it.
If it’s not found, it uses the capitalised version of the value coming from the API (Soloist).

Why does this kind of usage cause trouble? When it’s impossible for the static analyser to compute the id and the default value, it can’t correlate one translation (or a group of translations) to a usage point.

Provided with profile.functions.${fnName} the static analyser has no way of knowing that fnName will be a valid value (like soloist). “soloist” is coming from the API and babel-plugin-react-intl has no knowledge of the API and what it can return.

Because it doesn’t know the default values it can’t construct the default locale file (en-GB.json).

If you were to run the plugin you would get this error:

Messages must be statically evaluate-able for extraction

How to avoid unsupported usage

With some modifications we can change the code to avoid the error. We first need to define the keys inside of a defineMessages call. This acts as a hook for the system whereby the ambiguity still exists but is concentrated into the contents of this function.

const profileFunctions = defineMessages({
  composer: {
    id: 'profile.functions.composer',
    defaultMessage: 'Composer',
  },
  soloist: {
    id: 'profile.functions.soloist',
    defaultMessage: 'Soloist',
  },
...
});

And then change the render function to use intl.formatMessage:

(fnName) => profileFunctions[fnName]
  ? this.props.intl.formatMessage(profileFunctions[fnName])
  : capitalize(fnName)

When the API returns a fnName for a profile (like soloist) it’s only cross-referenced against the defineMessages declaration. This effectively limits the ambiguity to the set of pre-defined fnNames defined in the object.

By defining the id and default value in absolute terms we’ve made them statically analysable.
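Here is a self-contained version of the pattern that runs without React. In react-intl, defineMessages hands the descriptors back unchanged at runtime, so an identity function stands in for it here. fakeIntl is a made-up stub for the injected intl object that only returns the default message, and renderFunctionName is a hypothetical name for the render helper.

```javascript
// Stand-ins so the sketch runs on its own (assumptions, not the real library):
const defineMessages = (messages) => messages;
const capitalize = (s) => s.charAt(0).toUpperCase() + s.slice(1);
const fakeIntl = { formatMessage: (descriptor) => descriptor.defaultMessage };

const profileFunctions = defineMessages({
  composer: { id: 'profile.functions.composer', defaultMessage: 'Composer' },
  soloist: { id: 'profile.functions.soloist', defaultMessage: 'Soloist' },
});

// Known function names go through formatMessage; anything else is capitalised.
function renderFunctionName(intl, fnName) {
  const known = Object.prototype.hasOwnProperty.call(profileFunctions, fnName);
  return known
    ? intl.formatMessage(profileFunctions[fnName])
    : capitalize(fnName);
}

console.log(renderFunctionName(fakeIntl, 'soloist'));  // a pre-defined name
console.log(renderFunctionName(fakeIntl, 'lutenist')); // an unexpected name
```

The hasOwnProperty check is a choice of this sketch: it keeps a value such as constructor coming from the API from accidentally matching a property inherited from Object.prototype.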

The id, which previously had to be computed at runtime…

`profile.functions.${fnName}`

… is now just an enumeration of the possible values:

profile.functions.composer
profile.functions.soloist
etc.

The same goes for the default messages; instead of a function call we just explicitly define them.

If we get something unexpected from the API we will just capitalise it.

With these modifications we’ve made the code statically analysable and the babel-plugin-react-intl happy. Hello green badge! DO PASS GO. ✅

If we didn’t still have these usages we could drop linting altogether and just use the babel plugin to generate the JSON file. If we were to start over we would do our darndest to avoid these usages and use the babel plugin, but for now we have to make the best out of this imperfect situation.

Dealing with an imperfect situation

At the time we did not have the resources to fix the unsupported usages. We were stuck with doing things the old way, updating both the source files and the translation files by hand. It would, therefore, be likely we would make mistakes again. We wanted to compromise, to find a way of keeping on the straight and narrow until we could move to babel-plugin-react-intl.

We put together a simple script to do 3 things:

Use babel-plugin-react-intl to identify all of the non-ambiguous usages
Do an exact-match search on the source files for the ambiguous ones
Check a list of ignored usages (problems we know about)
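The three steps can be sketched roughly as follows. This is an illustration under assumptions, not the script from the gist: extractedIds stands for the ids the plugin could extract, sourceText for the concatenated contents of the source files, and ignored for the list of known problems. findUnusedKeys is a hypothetical name.

```javascript
// Returns the keys of the English file that none of the three steps account for.
function findUnusedKeys({ english, extractedIds, sourceText, ignored }) {
  const extracted = new Set(extractedIds);
  const skip = new Set(ignored);
  return Object.keys(english).filter((key) => {
    if (extracted.has(key)) return false;       // 1. statically extracted usage
    if (sourceText.includes(key)) return false; // 2. exact match in the source
    if (skip.has(key)) return false;            // 3. known, ignored problem
    return true;                                // nobody claims this key
  });
}

const unusedKeys = findUnusedKeys({
  english: {
    'a.title': 'Title',
    'a.dynamic': 'Dynamic',
    'a.legacy': 'Legacy',
    'a.dead': 'Dead',
  },
  extractedIds: ['a.title'],
  sourceText: "const id = cond ? 'a.dynamic' : 'a.title';",
  ignored: ['a.legacy'],
});
console.log(unusedKeys);
```

Note that a plain substring search is deliberately crude: a key that happens to be a prefix of a longer key would also count as found.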

A “make the best of it” solution for the real world.

We then hooked this into our standard linting script, which meant we got it as a badge on our CI inside of GitHub. You can find the script in this gist.

Again… you should not have this problem, but if you do, feel free to use our script to make the best of it.

Lessons learned

RTFM. For me this one is a big one. I still rush into developing without a good understanding of the problem space.
Exit codes: 0 for good, 1 for bad. Nice to have a refresher.
Technical debt creeps in where you least expect it.
It’s worth putting time into finding the right process which includes people as well as tools.