Using Type Safety to Make Translations Better

One of the reasons we chose Scala at Foursquare is its expressive type system. We can use the type system to turn large classes of runtime bugs into compile errors. Recently, we eliminated hundreds of bugs around translations by making our internal representation of translated copy more type safe.

The problem that we solved was an issue of automation. You see, translators need certain meta information related to copy in order to do their jobs well. Stuff like

  • Gender information
  • Pluralization information
  • Explanations of any placeholders

among others.

In the past, we had the actual copy definition separate from this meta information. The programmer would add copy to the codebase, then they would add all the meta information to a separate file. While doing code reviews, I noticed that engineers frequently forgot to edit the separate file, or they would remember, but wouldn’t add all of the information there.

This led to bad translations getting propagated to the end user…

Here, we had left off some pluralization information and ended up with the translation “1 tips”.

Here, we had left off some gender information, and called this user the male mayor “el alcalde” instead of the female “la alcaldesa”.

This is a snippet from an email that a user wrote to us. “Post” had been translated like “Post office.”

Getting Specific with the Problems

First, we wanted to automate the process of adding metadata to the separate file. We decided to do this by reading the metadata directly from the Scala source code and using the compiler to require the engineer to add the extra information. This lets us lean on the compiler and Scala’s powerful type system to make things more type safe. It also makes the process more automated and the code more readable.

Building a Builder

Hence, we created the LocalStringBuilder.

This builder has two type parameters, State and Goal. State is the current state of the LocalStringBuilder, and Goal is the state that you want to end up with.

Here is an example of using the builder to construct a string to be translated:

LocalStringBuilder
  .baseCopy("Your Swarm friend %1$s has checked in here")
  .baseComment("%1$s is the name of a friend with unknown gender")
  .femaleCopy("Your Swarm friend %1$s has checked in here")
  .femaleComment("%1$s is the name of a female friend")
  .maleCopy("Your Swarm friend %1$s has checked in here")
  .maleComment("%1$s is the name of a male friend")
  .result()

This particular copy is gendered, but the English copy is the same for all gender values. In Spanish, however, the translation differs by gender.

When you first create a LocalStringBuilder, State is InitialState, which is basically the empty state, and Goal is Basic, which has only the states that every piece of localized copy will need: the copy you want translated and an explanatory comment about that copy.

If you try to call .result on this newly created LocalStringBuilder, you will get a compile error, because State and Goal are not the same type.
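To make the State and Goal idea concrete, here is a minimal sketch of a phantom-typed builder. It is not our actual LocalStringBuilder: the names SketchBuilder, HasCopy, HasComment and start are made up for illustration, and the sketch asks for evidence that State is a subtype of Goal (State <:< Goal) instead of strict type equality, so that the order in which you call the methods does not matter.

```scala
object BasicBuilderSketch {
  // Phantom types: never instantiated, they only label what has been supplied.
  // All names in this object are hypothetical.
  sealed trait InitialState
  sealed trait HasCopy
  sealed trait HasComment

  // What every localized string needs before it can be built.
  type Basic = InitialState with HasCopy with HasComment

  final class SketchBuilder[State, Goal] private (
      text: Option[String],
      note: Option[String]
  ) {
    // Each call records the value and adds a marker to State.
    def baseCopy(c: String): SketchBuilder[State with HasCopy, Goal] =
      new SketchBuilder[State with HasCopy, Goal](Some(c), note)

    def baseComment(c: String): SketchBuilder[State with HasComment, Goal] =
      new SketchBuilder[State with HasComment, Goal](text, Some(c))

    // Compiles only when State provides everything Goal asks for.
    def result()(implicit complete: State <:< Goal): (String, String) =
      (text.getOrElse(""), note.getOrElse(""))
  }

  object SketchBuilder {
    def start: SketchBuilder[InitialState, Basic] =
      new SketchBuilder[InitialState, Basic](None, None)
  }

  val built: (String, String) =
    SketchBuilder.start
      .baseCopy("Leave a tip")
      .baseComment("Label on the button that opens the tip composer")
      .result()

  // Does not compile, because State lacks HasComment:
  // SketchBuilder.start.baseCopy("Leave a tip").result()
}
```

The values themselves live in ordinary fields. The type parameters carry no data at runtime; they exist only so the compiler can refuse an incomplete builder.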

You might be wondering why there isn’t just a State. When we call .result, we could just check to make sure that State has a comment trait and a copy trait and be done. We have the Goal parameter because the Goal state may change depending on what functions you call on the builder.

Recall that gender information is something that translators may need to translate a string correctly. For example, let’s take the English copy “She was all like…” This is text that appears before a tip that a female user left. We have different copy in the same circumstance for users who are men (“He was all like…”) and for users who did not specify their gender (“They were all like…”). In this case, the English copy is different for each gender, but there are many cases where the English copy is the same but will be different in other languages.

We have a .gender function on the LocalStringBuilder, but as you can see above, you must also provide the different variations of a piece of copy. So, if you call .gender or add specific male or female copy to the builder, the type of Goal changes from Basic to Gendered. Gendered has traits for female and male copy as well as for the specific gender at creation time. If you have at least one of these pieces of info but are missing at least one other, the code will not compile.

There is a similar deal with the .pluralize function. You need to specify a singular version of the copy as well as the specific number that you’re pluralizing on. If you provide one without the other, the code won’t compile.

You can have a LocalStringBuilder that is both Gendered and Pluralized. This creates a new Goal called GenderedAndPluralized. This goal requires the extra information of singular copy for both the male and female cases.
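One way to picture a Goal that moves is the sketch below. It is a simplification under stated assumptions, not our real code: instead of named goals such as Gendered or GenderedAndPluralized, each gender or plural call here widens Goal with the extra markers it now requires, and result() asks for State <:< Goal. The marker names, the Sketch class and its string keys are all hypothetical, and the sketch does not model the per-gender singular copy that GenderedAndPluralized requires.

```scala
object GoalSketch {
  // Phantom markers for each piece of information (hypothetical names).
  sealed trait Empty
  sealed trait HasCopy
  sealed trait HasComment
  sealed trait HasGender
  sealed trait HasFemaleCopy
  sealed trait HasMaleCopy
  sealed trait HasCount
  sealed trait HasSingularCopy

  type Basic       = Empty with HasCopy with HasComment
  type GenderNeeds = HasGender with HasFemaleCopy with HasMaleCopy
  type PluralNeeds = HasCount with HasSingularCopy

  final class Sketch[State, Goal] private (val parts: Map[String, String]) {
    // S and G are inferred from each caller's declared return type.
    private def add[S, G](key: String, value: String): Sketch[S, G] =
      new Sketch[S, G](parts + (key -> value))

    def baseCopy(c: String): Sketch[State with HasCopy, Goal] = add("copy", c)
    def baseComment(c: String): Sketch[State with HasComment, Goal] = add("comment", c)

    // Any gender-related call widens Goal to demand all gender information.
    def gender(g: String): Sketch[State with HasGender, Goal with GenderNeeds] =
      add("gender", g)
    def femaleCopy(c: String): Sketch[State with HasFemaleCopy, Goal with GenderNeeds] =
      add("female", c)
    def maleCopy(c: String): Sketch[State with HasMaleCopy, Goal with GenderNeeds] =
      add("male", c)

    // Likewise for pluralization.
    def pluralize(n: Int): Sketch[State with HasCount, Goal with PluralNeeds] =
      add("count", n.toString)
    def singularCopy(c: String): Sketch[State with HasSingularCopy, Goal with PluralNeeds] =
      add("singular", c)

    def result()(implicit complete: State <:< Goal): Map[String, String] = parts
  }

  object Sketch {
    def start: Sketch[Empty, Basic] = new Sketch[Empty, Basic](Map.empty)
  }

  val tips: Map[String, String] =
    Sketch.start
      .baseCopy("%1$s left %2$d tips")
      .baseComment("%1$s is a user name, %2$d is a number of tips")
      .gender("female")
      .femaleCopy("%1$s left %2$d tips")
      .maleCopy("%1$s left %2$d tips")
      .pluralize(1)
      .singularCopy("%1$s left 1 tip")
      .result()

  // Does not compile, because gender was set without female and male copy:
  // Sketch.start.baseCopy("a").baseComment("b").gender("female").result()
}
```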

In addition to all of these things, we also ensure that you don’t call the same function twice on the builder.

We can do all of these things using phantom types and implicits. Here is a code snippet for the function where we specify the gender of this particular copy:

def gender[S2](g: Gender.Value)(
  implicit addGender: LocalStringBuilder.AddGender[State, S2],
  checkNotAddedTwice: LocalStringBuilder.NotAlready[State, LocalStringBuilder.HasGender]
): LocalStringBuilder[LocalStringBuilder.Gendered, S2] = {
  this.copy(genderOpt = Some(g))
}

The implicit parameter addGender takes State, and creates S2 = State with HasGender.

The implicit parameter checkNotAddedTwice checks whether State already includes HasGender, that is, whether State <: HasGender. If this is the case, the code will not compile.

Note that the goal type has changed from Goal to LocalStringBuilder.Gendered.
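The snippet above does not show how AddGender and NotAlready are defined, so here is one way they could look. Treat it as a sketch under assumptions: the instance definitions, the Sketch class and the Basic and Gendered markers are invented for illustration, and the type parameter order (Goal first, then State) is inferred from the return type above. NotAlready uses the well-known ambiguous-implicits trick: when State is already a subtype of HasGender, two equally specific instances apply and the compiler rejects the call.

```scala
object GenderEvidenceSketch {
  sealed trait InitialState
  sealed trait HasGender
  sealed trait Basic
  sealed trait Gendered

  object Gender extends Enumeration { val Female, Male, Unknown = Value }

  // Evidence that S2 is S extended with HasGender.
  sealed trait AddGender[S, S2]
  object AddGender {
    implicit def instance[S]: AddGender[S, S with HasGender] =
      new AddGender[S, S with HasGender] {}
  }

  // Evidence that S is NOT already a subtype of T.
  sealed trait NotAlready[S, T]
  object NotAlready {
    // The only applicable instance when S is not a subtype of T.
    implicit def ok[S, T]: NotAlready[S, T] = new NotAlready[S, T] {}
    // When S <: T these two also apply and are ambiguous with each other,
    // so implicit resolution fails at compile time. They are never invoked.
    implicit def clash1[S, T >: S]: NotAlready[S, T] = sys.error("never called")
    implicit def clash2[S, T >: S]: NotAlready[S, T] = sys.error("never called")
  }

  final case class Sketch[Goal, State](genderOpt: Option[Gender.Value] = None) {
    def gender[S2](g: Gender.Value)(
        implicit addGender: AddGender[State, S2],
        checkNotAddedTwice: NotAlready[State, HasGender]
    ): Sketch[Gendered, S2] =
      Sketch[Gendered, S2](genderOpt = Some(g))
  }

  // S2 is solved by the implicit search as InitialState with HasGender.
  val once: Sketch[Gendered, InitialState with HasGender] =
    Sketch[Basic, InitialState]().gender(Gender.Female)

  // Does not compile, because of ambiguous implicit values for checkNotAddedTwice:
  // once.gender(Gender.Male)
}
```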

At this point, we’ve gotten a bit deep into the code, so let me bring it back to the user. Remember at the top, when Swarm told me that I was a male Mayor? This was because the engineer had passed down the user gender via the gender() function, but failed to specify the separate copy for different genders. This led to Spanish translators only receiving the gender-neutral version to translate.

Now, this type of bug is impossible. The programmer cannot specify gender without providing the separate copy for translators.

Testing

You might be wondering how you test that this code is correct. Obviously, you can just add sample correct cases and if the code compiles, then you’re all good. But how do you test code that shouldn’t compile?

For this, we used something that was open sourced by Foursquare a few years ago as part of the Spindle project. Essentially, we run a Scala REPL in our test code and check the output to ensure that we have a compiler error and that it matches what we expect.
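If you want to try the idea without the Spindle helper, a rough stand-in is the runtime ToolBox that ships with the Scala 2 compiler. This is a swapped-in technique, not the REPL-based approach described above, and it assumes Scala 2 with scala-compiler and scala-reflect on the test classpath. The object and method names are made up for the sketch.

```scala
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.{ToolBox, ToolBoxError}

object CompileCheckSketch {
  // A runtime compiler instance that sees the classes on the current classpath.
  private val toolbox = currentMirror.mkToolBox()

  // Returns the compiler's message if the snippet fails to parse or typecheck,
  // or None if it typechecks cleanly.
  def compileError(code: String): Option[String] =
    try {
      toolbox.typecheck(toolbox.parse(code))
      None
    } catch {
      case e: ToolBoxError => Some(e.getMessage)
    }
}
```

A test can then assert that a correct snippet returns None and that a snippet which should be rejected returns a message containing the expected error text.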

Conclusion

After implementing this change, we turned a bunch of user-facing bugs into hundreds of compile errors. Once the code compiled and the tests passed, we were able to ship it and prevent countless future bugs. We would not have been able to do this with just any language. Scala’s type system has really been our secret superpower for our internationalization pipeline. If you too are interested in turning bugs into compile errors, we’re hiring!

Maryam Aly and the rest of the Foursquare #i18n team