Getting Canopy to speak Chinese and German

Published in Engineering @ Canopy, Jan 8, 2018

Canopy is a B2B company headquartered in Singapore, helping private banks and wealth managers increase traction with their customers.

As Canopy starts to be offered to clients around the world, we are keen to localise our platform. To kickstart the internationalisation efforts, we decided to start with multilingual support. In this post, we will outline our solution design, and the journey we took towards our goal.

The frontend team identified some goals for the first phase:

Only support non-RTL languages (RTL = right to left, e.g. Arabic)
Automate the translations where possible to minimise human effort
Get everything non-English in prod reviewed by a language expert
Allow non-programmers to be able to maintain the translations
Extend the translations concept to other API consumers, such as our PDF reports

With a set of guidelines in place, we set out to pick our weapons and plan our approach:

Tools

Since we weren’t worried about RTL languages, we didn’t need to worry about CSS layout restructuring. We wanted dictionaries (a collection of key/value pairs) per language, which would help us pick the translation of a given term in the chosen language.

Google’s Cloud API provides machine translations, so we used it to build our initial dictionaries. Obviously, the nomads in the house weren’t shy about sharing some travel jokes where Google got the translations wrong. We didn’t want our customers to be subjected to confusing translations, so we built a nifty in-house interface on top of the Google Sheets API that lets language experts review the translations and amend the ones that don’t make sense. The GitHub API handles communication between the spreadsheet and our repos, so our translations always stay in sync. Finally, our CI tool, Travis, runs the scripts that get our production site to use the freshest content.

Our frontend application is powered by the ambitious Ember framework. We decided to go with ember-i18n, an addon that could ease the management of our various language dictionaries. We had ~1000 strings to extract from the app, so we relied on ember-template-lint to help us with automating that internally.

Our intended flow for the entire translation process

First steps

We decided our first prototype would let one pick between 2 languages and show manually translated text for a few headers that we chose at random. We had to:

Introduce a dropdown for the customer to choose a language from
Move a few sample bare strings to the format our addon expected, e.g. on the login template, Login to {{config.appName}} became {{t 'login.to' }}. login.to is the key in the dictionaries for every language, so if the user chose Portuguese, the addon looks up the right dictionary and replaces the string with the corresponding phrase in Portuguese. We wanted Canopy to be called Canopy, of course, and not, say, the Portuguese word for "tree"; the addon provides for that
Set the default language to English — in the absence of the translation in our desired language, customers would still see the English word
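
To make the lookup concrete, here is a minimal sketch of the idea in plain JavaScript. It is not the addon's API: the dictionaries, the German phrase, the `{{appName}}` placeholder convention, and the `translate` and `interpolate` helpers are all assumptions made for illustration. It shows a per-language lookup, a placeholder that keeps "Canopy" untranslated, and the fallback to English.

```javascript
// Hypothetical per-language dictionaries: flat key -> phrase maps.
const dictionaries = {
  en: { "login.to": "Login to {{appName}}", "login.submit": "Sign in" },
  de: { "login.to": "Bei {{appName}} anmelden" },
};

const DEFAULT_LOCALE = "en";

// Replace {{name}} placeholders with values from `params`.
// Placeholders without a matching value are left intact.
function interpolate(phrase, params = {}) {
  return phrase.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(params, name) ? String(params[name]) : match
  );
}

// Look up `key` in the chosen locale, fall back to English,
// and finally fall back to the key itself.
function translate(locale, key, params) {
  const chosen = dictionaries[locale] || {};
  const phrase = chosen[key] ?? dictionaries[DEFAULT_LOCALE][key] ?? key;
  return interpolate(phrase, params);
}
```

With these sample dictionaries, `translate("de", "login.to", { appName: "Canopy" })` returns the German phrase with "Canopy" filled in, while `translate("de", "login.submit")` falls back to the English "Sign in" because the German dictionary has no such key yet.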

… don’t let me lead you to believe nothing went wrong, and that we didn’t forget to do like, a couple of app restarts. Anyway, we did those, and you won’t believe what we saw next:

Our first prototype worked!

Soon we realised what we had gotten ourselves into. Some quick analysis revealed we had ~1000 strings across the app, and just the thought of manually moving them into a dictionary got our lazy programmer muscles twitching…

So we spent the next day automating things, which meant writing two scripts:

One involving a regex check to identify occurrences of bare strings in the app, and to convert them to the format the addon expects, e.g. as before, Login to {{config.appName}} became {{t 'login.to' }}
Another leveraging ember-template-lint to identify bare strings in the app, and extract them out into our English dictionary
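
As a rough illustration of what such automation can look like (not the actual scripts, and using only a regex rather than ember-template-lint), the sketch below rewrites plain text between tags into `{{t '...'}}` calls and collects the extracted strings into an English dictionary. The key scheme (`prefix.slug`), the helper names, and the decision to skip text that contains mustaches are all assumptions.

```javascript
// Turn "Forgot password?" into "forgot_password" (hypothetical key scheme).
function slugify(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Replace plain text between tags with {{t 'prefix.slug'}} and collect the
// extracted strings. Text that contains mustaches or braces is left alone.
function extractBareStrings(template, prefix) {
  const dictionary = {};
  const rewritten = template.replace(/>([^<>{}]+)</g, (match, text) => {
    const trimmed = text.trim();
    // Skip runs that are only whitespace, digits, or punctuation.
    if (!/[A-Za-z]/.test(trimmed)) return match;
    const key = `${prefix}.${slugify(trimmed)}`;
    dictionary[key] = trimmed;
    return `>{{t '${key}'}}<`;
  });
  return { rewritten, dictionary };
}
```

Calling `extractBareStrings("<button>Forgot password?</button>", "login")` would rewrite the button text to `{{t 'login.forgot_password'}}` and record the original phrase under that key. A regex like this is deliberately naive: it ignores attributes such as placeholders and titles, which is one reason a template-aware linter is useful alongside it.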

Woot, woot! Our English dictionary was starting to get bigger! Now for the fun part: we extended google-translate to suit our needs, and retrieve our translations in bulk. And just like that…

Magic! It worked!

Everybody in the frontend team then got philosophical and took a minute to appreciate how powerful our technology ecosystem had become: we were now closer than ever to bringing our translations to life!

Next, we spent some time organising our script to fetch translations and classify them into different files by language. If the frontend team were the only ones maintaining translations, we would basically have been done. One would go to the English dictionary, add a new key and run an npm script, which would generate the entries for the other languages and create a GitHub Pull Request!
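
The core of such an npm script can be sketched as two pure steps around the call to the translation service: find the keys a language is missing, then merge the returned translations in. This is an illustration under assumptions, not the actual script. The function names and sample dictionaries are hypothetical, and the network call itself is deliberately left out.

```javascript
// Keys present in the English dictionary but absent from `target`.
function missingKeys(english, target) {
  return Object.keys(english).filter((key) => !(key in target));
}

// Merge machine translations for `keys` into a copy of `target`.
// `translations[i]` is assumed to be the translation of the phrase for `keys[i]`.
function mergeTranslations(target, keys, translations) {
  if (keys.length !== translations.length) {
    throw new Error("keys and translations must line up");
  }
  const merged = { ...target };
  keys.forEach((key, i) => {
    merged[key] = translations[i];
  });
  return merged;
}

// Example: one key was added to the English dictionary only.
const english = { "login.to": "Login to {{appName}}", "login.submit": "Sign in" };
const german = { "login.to": "Bei {{appName}} anmelden" };

const keys = missingKeys(english, german); // ["login.submit"]
// The batch of phrases that would be sent to the translation service.
const phrases = keys.map((key) => english[key]);
```

Only the missing keys are sent for translation, so entries that a language expert has already reviewed are never overwritten by a fresh machine translation.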

Some virtual beer cheer was exchanged on Slack and we were set for the final exercise.

To recap, at this stage, a language expert who’d like to review translations would need to understand the GitHub Pull Requests flow, and point out inappropriate translations that the programmers would then fix and push.

While we Canopy programmers love our GitHub features, we wanted to give language experts a tool they were already familiar with, and Google Sheets came to mind.

Google Sheets has a robust API, so the next steps were clear:

Start a Google Spreadsheet with one sheet per language. Store keys in one column, and translations in another
Provide an interface for the language expert to publish changes made to one or more of the dictionaries
Trigger a pull request to the repository where all the dictionaries were being maintained
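
The data handling behind these steps can be sketched as follows, assuming the sheet's contents arrive as an array of rows, each an array of cell values with a header row first. The helper names are hypothetical, and the calls to the Sheets and GitHub APIs are left out.

```javascript
// Convert sheet rows ([key, translation] pairs, header row first) into a
// dictionary. Rows with a missing key or translation are skipped.
function rowsToDictionary(rows) {
  const dictionary = {};
  for (const [key, translation] of rows.slice(1)) {
    if (key && translation) dictionary[key.trim()] = translation;
  }
  return dictionary;
}

// Convert a dictionary back to rows, sorted by key so diffs stay stable.
function dictionaryToRows(dictionary) {
  const keys = Object.keys(dictionary).sort();
  return [["key", "translation"], ...keys.map((key) => [key, dictionary[key]])];
}

// Keys whose translation in the sheet differs from the copy in the repo.
function changedKeys(repoDictionary, sheetDictionary) {
  return Object.keys(sheetDictionary).filter(
    (key) => repoDictionary[key] !== sheetDictionary[key]
  );
}
```

A publish step could then open a pull request only when `changedKeys` returns a non-empty list for the selected language, which keeps the repository free of no-op updates.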

We did, and ended up with a nifty control panel to help with the publishing. The final interface looks as follows:

One can publish updates to just one language, or all of them at once

Our language expert now just needs to log in to the Google Spreadsheet, pick the sheet corresponding to the language they’re interested in, and edit the translation for the key they choose to amend.

As a final step, we also moved our dictionaries and scripts to a common repository so our Rubyist friends driving the API team can take advantage of the effort put in so far. The API endpoints will soon support a lang param that will determine the language used for the results. The lookup will be done based on the same Spreadsheet — we’re going to maintain data translations in a single source.

That’s the story of how we found a robust, idiot-proof approach to teaching Canopy to speak a few new languages fluently. If Martin still remembers his German well, that is 😉

Written by Sarup Banskota