Crossing International Borders

Much like moving abroad isn’t as simple as getting on a plane and going, deploying an application for a global market takes time, planning and attention to detail. There’s a term for this: Globalisation.

Globalisation, or g11n as it’s often abbreviated (g and n are the first and last letters of the word, and 11 is the number of letters in between), is defined by MSDN as the process of designing and developing applications that function for multiple cultures.

“Globalisation is the process of designing and developing applications that function for multiple cultures.”

It’s different to localisation (l10n), which is the process of customising your application for a given culture and locale. In other words, being able to communicate with the end user in their language, in a way that makes sense to them.

“Localisation is the process of customising your application for a given culture and locale.”

There are over 80 technical internationalisation (i18n) rules that developers need to follow, and the complete list is given in the Globalisation Design Guide by IBM. I will not attempt to cover these. Instead, I shall mention only a few to give an idea of the breadth and depth of what going global entails.

Translations are the obvious first step toward the end goal, but one often needs to consider more subtle differences too: the way dates and times are formatted, number and currency formats, the proper rules for pluralisation, and the social acceptability of graphics and social media, to mention just a few. As you can tell, the complexities multiply!

When translating from one language to another, one also has to keep in mind differences in string (text) length, text direction, the size of the actual characters and the different character sets that languages use. It’s recommended to use UTF-8 wherever possible, since it can encode the characters of all of these languages.

Date and time formatting is another interesting one. In the US, dates are typically written as month/day/year, whereas most of the rest of the world orders the parts sequentially, i.e. day/month/year or year/month/day. Think about a user who’s booked a lovely summer vacation in New York City for the first week of June and typed the date as 02/06, only to find out the actual booking was made for the 6th of February!
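A minimal sketch of how the built-in Intl.DateTimeFormat constructor handles this ordering. The helper names and the sample date are assumptions made for illustration.

```javascript
// 2 June 2019, pinned to UTC so the example does not depend on the machine's time zone.
const bookingDate = new Date(Date.UTC(2019, 5, 2));

// Hypothetical helper: format a date the way readers in `locale` expect it.
function formatBookingDate(locale, date) {
  return new Intl.DateTimeFormat(locale, { timeZone: 'UTC' }).format(date);
}

console.log(formatBookingDate('en-US', bookingDate)); // 6/2/2019 (month first)
console.log(formatBookingDate('en-GB', bookingDate)); // 02/06/2019 (day first)

// Hypothetical helper: a long month name removes the ambiguity altogether.
function formatUnambiguous(locale, date) {
  return new Intl.DateTimeFormat(locale, {
    timeZone: 'UTC',
    day: 'numeric',
    month: 'long',
    year: 'numeric',
  }).format(date);
}

console.log(formatUnambiguous('en-US', bookingDate)); // June 2, 2019
console.log(formatUnambiguous('en-GB', bookingDate)); // 2 June 2019
```

Storing the date as a real Date (or an ISO 8601 string) and formatting it only at display time keeps the booking itself unambiguous.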

Another tripping point is time. Not all cultures know that noon refers to midday, or 12:00. Some cultures use a 24-hour clock (known as military time in the US), so a newsletter scheduled to be sent at 16:45 would be understood to go out at 4:45 pm. Other cultures use a 12-hour clock, and the general public may be very concerned that suddenly their day has an additional 4 hours and 45 minutes!
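The same constructor can render a time in whichever clock convention a locale uses. This is a sketch only: the helper names and the sample time are assumptions, and the exact spacing around "PM" can vary between runtime versions.

```javascript
// 16:45 UTC on an arbitrary day; UTC keeps the output independent of the host time zone.
const sendAt = new Date(Date.UTC(2019, 5, 2, 16, 45));

// Hypothetical helper: show a send time in the clock convention of `locale`.
function formatSendTime(locale, date) {
  return new Intl.DateTimeFormat(locale, {
    timeZone: 'UTC',
    hour: 'numeric',
    minute: '2-digit',
  }).format(date);
}

// Hypothetical helper: true when the locale defaults to a 12-hour clock.
function usesTwelveHourClock(locale) {
  return new Intl.DateTimeFormat(locale, { hour: 'numeric' }).resolvedOptions().hour12;
}

console.log(formatSendTime('en-US', sendAt)); // 12-hour style with an AM/PM marker
console.log(formatSendTime('de-DE', sendAt)); // 16:45
console.log(usesTwelveHourClock('en-US'), usesTwelveHourClock('de-DE')); // true false
```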

Pluralisation is something that’s often overlooked. English has just two forms, singular and plural (the plural also covers zero), whereas other languages distinguish more categories, such as zero, one, two, few and many. Some languages even change the actual word depending on what is being pluralised. Be very careful here, and always ensure a native speaker checks everything over.
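To make the plural categories concrete, here is a small sketch using the built-in Intl.PluralRules, which reports the category a number falls into for a given locale. The message tables and the pluralise helper are assumptions for illustration, and the Polish strings would still need checking by a native speaker.

```javascript
// Illustrative message tables keyed by plural category (not from a real app).
const messages = {
  en: { one: '{n} invoice', other: '{n} invoices' },
  pl: { one: '{n} faktura', few: '{n} faktury', many: '{n} faktur', other: '{n} faktury' },
};

// Hypothetical helper: pick the message form that matches the number.
function pluralise(locale, n) {
  const category = new Intl.PluralRules(locale).select(n);
  const forms = messages[locale];
  const template = forms[category] || forms.other;
  return template.replace('{n}', String(n));
}

console.log(pluralise('en', 1)); // 1 invoice
console.log(pluralise('en', 0)); // 0 invoices
console.log(pluralise('pl', 2)); // 2 faktury
console.log(pluralise('pl', 5)); // 5 faktur
```

Keying the messages by category, rather than writing `n === 1 ? ... : ...` in code, lets translators supply as many forms as their language needs.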

Some languages have formal and informal registers, and care has to be taken to properly convey the original intent of the message. As an example, utter confusion might reign if a customer clicks on a seemingly happy-go-lucky newsletter that, on translation, resulted in a very formal message!

So how do we successfully deploy anything in more than one locale without developing a separate front-end for each and every locale we’re targeting? Luckily, the browser has a wonderful object, the Intl object, that’s available through the ECMAScript Internationalisation API.

The Intl object methods available in Chrome.

(Note that the only browser that does not currently support these APIs is Safari. Here we can, however, use a polyfill, a shim for a browser API.)
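A rough sketch of the feature check that usually precedes loading a polyfill. The function names are assumptions, and `loadPolyfill` is a placeholder for whichever polyfill you choose rather than a real API.

```javascript
// True when the runtime offers the Intl constructors used in this article.
function hasIntlSupport(globalObject) {
  const intl = globalObject.Intl;
  return typeof intl === 'object' && intl !== null &&
    typeof intl.DateTimeFormat === 'function' &&
    typeof intl.NumberFormat === 'function';
}

// Hypothetical loader: only fetch the polyfill when the native API is missing.
function ensureIntl(globalObject, loadPolyfill) {
  if (hasIntlSupport(globalObject)) return Promise.resolve('native');
  return Promise.resolve(loadPolyfill()).then(() => 'polyfilled');
}

console.log(hasIntlSupport(globalThis)); // true in any runtime that ships Intl
```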

Internationalisation is by no means a new problem. It’s been around for a while and, as such, various libraries have been released to help ‘ease the pain’. I’ll briefly discuss one of these, I18n-JS, and leave you to explore the others.

I18n-JS

The wonderfully small I18n-JS library is a Ruby gem that makes I18n translations available on the JavaScript side. I chose to build a demo invoice using Express and React to explore this library, since that is what I am currently learning at bootcamp.

A comparison between four localisation implementations of a simple invoice is shown in the image below, with the first two comparing US and GB localisations in English.

locale: en-US
locale: en-GB

The only differences between the English tables are the way the date is formatted and the currency symbol used. Language-wise, nothing’s changed.

locale: af-ZA

The Afrikaans version of this table shows the effect on the user interface when the language changes from English. Words are longer, often because several are combined into one. Even the table headings are longer and, in this example, the table provides just enough space to render the full “Unit Cost” and “Quantity” columns. You’ll perhaps notice that the technical reference to the website address in the first line item has not changed. Technical terms remain as they are and this is not a bug! Translators, when given sufficient context about what they’re translating, will know this.

The Afrikaans version also nicely shows what happens when currencies are converted. In Rands, a previously small dollar amount is now much larger. If the exchange had been to a currency resulting in something even larger, the space provided in the table might be insufficient.

locale: de-DE

The last example shows a distinct difference in the way delimiters are used in different locales. In German, we use a “.” to separate thousands and a “,” as the decimal mark. German also shows the first appearance of an umlaut, a mark ( ¨ ) used over a vowel to indicate a different vowel quality. Diacritics like this are common in many languages other than English.
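The delimiter difference can be reproduced with the built-in Intl.NumberFormat. This is a sketch with assumed helper names and an arbitrary sample amount.

```javascript
const amount = 1234567.89;

// Hypothetical helper: group and punctuate a number for `locale`.
function formatNumber(locale, value) {
  return new Intl.NumberFormat(locale).format(value);
}

console.log(formatNumber('en-US', amount)); // 1,234,567.89
console.log(formatNumber('de-DE', amount)); // 1.234.567,89

// Hypothetical helper: currency style also decides where the symbol goes.
function formatCurrency(locale, currency, value) {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(value);
}

console.log(formatCurrency('en-US', 'USD', amount)); // $1,234,567.89
console.log(formatCurrency('de-DE', 'EUR', amount)); // the symbol follows the number
```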

I18n-js is very versatile, easy to incorporate in an Express app and uses the app’s cookies to implement the locale on all routes. I did have to install some cookie parsing middleware (I used the cookie-parser package from npm) and mounted it below the body parser middleware. The server.js file is shown partially below (see here for all the code).

server.js for an Express app using LOCALE on the render object.

I’ve added the key for LOCALE on the request cookies with a default of en-US. Always include a default!
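A minimal sketch of that idea, not the demo’s actual server.js. It assumes cookie-parser has already populated `req.cookies`; the function names, the supported-locale list and the use of `res.locals` are assumptions for illustration.

```javascript
const DEFAULT_LOCALE = 'en-US';
const SUPPORTED_LOCALES = ['en-US', 'en-GB', 'af-ZA', 'de-DE'];

// Pick the locale from the parsed cookies, falling back to the default.
function resolveLocale(cookies) {
  const requested = cookies && cookies.LOCALE;
  return SUPPORTED_LOCALES.includes(requested) ? requested : DEFAULT_LOCALE;
}

// Express-style middleware: exposes the locale to every rendered view.
function localeMiddleware(req, res, next) {
  res.locals.LOCALE = resolveLocale(req.cookies);
  next();
}

// In server.js this would be mounted after the parsers, for example:
//   app.use(cookieParser());
//   app.use(localeMiddleware);
```

Validating the cookie against a known list means a stale or tampered value quietly falls back to the default instead of breaking the page.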

Within the routes file, we grab the requested locale from the request parameters and set that on the cookie before redirecting the user to the requested page:

server side routing
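The route described above could be sketched roughly as follows. The route path, the `returnTo` query parameter and the helper names are hypothetical; `res.cookie` and `res.redirect` are the standard Express response methods.

```javascript
const SUPPORTED_LOCALES = ['en-US', 'en-GB', 'af-ZA', 'de-DE'];

// Simplistic check that keeps redirects on this site: accept only local paths.
function safeReturnPath(path) {
  const isLocal = typeof path === 'string' &&
    path.startsWith('/') && !path.startsWith('//') && !path.includes('\\');
  return isLocal ? path : '/';
}

// Express-style handler for a hypothetical route such as GET /locale/:locale
function setLocaleHandler(req, res) {
  const { locale } = req.params;
  if (SUPPORTED_LOCALES.includes(locale)) {
    res.cookie('LOCALE', locale); // unsupported values leave the cookie untouched
  }
  res.redirect(safeReturnPath(req.query.returnTo));
}

// Mounted in the routes file with something like:
//   router.get('/locale/:locale', setLocaleHandler);
```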

On the client side, we import I18n-js in the main React app so that it’s available to the child routes rendered within the main route. From there we head off to our component and modify all that previously hard-coded table text to change dynamically based on the locale in the cookie:

Dynamically render table header translations
Snippet from the table data code
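To show the idea without the library, here is a self-contained sketch of a dotted-key lookup with a default-locale fallback. In the demo, I18n-js’s own `I18n.t` plays this role; the `translate` and `tableHeaders` helpers, the keys and the strings here are stand-ins for illustration (and the Afrikaans would need a native speaker’s review).

```javascript
// Illustrative translations, deliberately missing "amount" in Afrikaans.
const translations = {
  'en-US': { invoice: { headers: { description: 'Description', quantity: 'Quantity', amount: 'Amount' } } },
  'af-ZA': { invoice: { headers: { description: 'Beskrywing', quantity: 'Hoeveelheid' } } },
};
const DEFAULT_LOCALE = 'en-US';

// Walk a dotted key such as "invoice.headers.quantity" through a nested object.
function lookup(tree, key) {
  return key.split('.').reduce((node, part) => (node == null ? undefined : node[part]), tree);
}

// Try the requested locale, then the default, then flag the gap visibly.
function translate(locale, key) {
  const found = lookup(translations[locale], key);
  if (typeof found === 'string') return found;
  const fallback = lookup(translations[DEFAULT_LOCALE], key);
  return typeof fallback === 'string' ? fallback : `[missing "${key}"]`;
}

const headerKeys = ['description', 'quantity', 'amount'];

// In a React component each of these strings would be rendered into a table heading.
function tableHeaders(locale) {
  return headerKeys.map((name) => translate(locale, `invoice.headers.${name}`));
}

console.log(tableHeaders('af-ZA')); // [ 'Beskrywing', 'Hoeveelheid', 'Amount' ]
```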

I chose to include currency conversions directly in my demo app to illustrate differences in number formatting for larger numbers. This is purely for demonstration purposes and should not be used in production! If you did want to include something along those lines, you would need to design and develop for it deliberately.

All the items rendered within the table are values from the relevant translation object. It is this object that one would send off to be translated by professionals (or at least native speakers of the language if there wasn’t any other option!).

A snippet from the object providing the table content is shown below.

An I18n object used for German localisation
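As an illustration of the shape such an object can take, and of a check worth running before sending it to translators, here is a sketch that compares a source object with its German counterpart and lists untranslated keys. The objects are invented for this example and are not the demo’s actual files; `flattenKeys` and `missingKeys` are hypothetical helpers.

```javascript
// Illustrative source-language object.
const en = {
  invoice: {
    title: 'Invoice',
    headers: { description: 'Description', quantity: 'Quantity', unitCost: 'Unit Cost' },
  },
};

// Illustrative German object with one heading still untranslated.
const de = {
  invoice: {
    title: 'Rechnung',
    // Translator note: column headings, keep them short.
    headers: { description: 'Beschreibung', quantity: 'Menge' },
  },
};

// Flatten nested keys into dotted paths, e.g. "invoice.headers.quantity".
function flattenKeys(node, prefix = '') {
  return Object.entries(node).flatMap(([key, value]) => {
    const path = prefix ? `${prefix}.${key}` : key;
    return typeof value === 'object' && value !== null ? flattenKeys(value, path) : [path];
  });
}

// Keys present in the source language but absent from the translation.
function missingKeys(source, target) {
  const translated = new Set(flattenKeys(target));
  return flattenKeys(source).filter((key) => !translated.has(key));
}

console.log(missingKeys(en, de)); // [ 'invoice.headers.unitCost' ]
```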

The I18n object is versatile and you can pretty much give it any and all keys you need. I kept it simple and used nested objects only when necessary. The gem provides details about attributes that should be used and the tests show formatting of numbers and dates.

For this simple example, I went with Google Translate but cannot be certain that the actual translations are ideal. Even though I speak Afrikaans, it’s been such a long time since I’ve used it like a native that I had to use Google to translate a lot of that as well. Be sure to double check with a native speaker that the content, when rendered with the surrounding context, makes sense, otherwise you may end up with some very confused end users.

Ideally, localisation should happen after development so that the translators have more of an idea of the intent of the communication. Ideally, one would also have translation tools that allow translators to work directly inline. This makes their task so much easier as they don’t have to infer context, and it also addresses the issue of formal/informal language discussed above! It also gives the translators more freedom to reword sentences appropriately to keep within the rendering boundaries provided. If that’s not possible, make segments as long as possible to portray the context or, at the very least, annotate them. On the topic of annotations, keep in mind that YAML and JavaScript files support comments, while the JSON format does not.

After all that, one has to face the fact that any development that happens sequentially means extra time, so internationalisation and localisation often occur concurrently. It’s also OK to take it step by step and decide upfront what is critical to the current situation. However, the cost of going back to refactor a lot of code could outweigh the cost of taking the extra time to do it properly up front.

That’s it for my brief introduction. I hope I’ve given sufficient information to at least raise awareness of everything required in the whole going-global process. There are a host of other options out there and I encourage exploration of them. The more we all know, the better off we will be when our own app needs to get its “passport”!