Google Translate adds support for 110 more languages

By Catherine Chef
Published in Startup Reviews, Jul 1, 2024

Last week, Google announced that its translation service had added support for 110 additional languages. The expansion was made possible by the PaLM 2 large language model.

The list of new languages includes, among others, Abkhazian, Bashkir, Buryat, Ossetian, Udmurt and Chechen. About a quarter of the additions are African languages, such as Afar (Ethiopia), NKo (West Africa) and Tamazight (Morocco). Other additions include Tok Pisin (Papua New Guinea). The translator also learned to recognize several regional languages and varieties, including Cantonese (China), Manx (Isle of Man) and Punjabi (India, Pakistan).

According to the company's statement, more than 614 million people in total speak the newly added languages, or about 8% of the world's population.
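As a rough sanity check on the 8% figure, the share can be recomputed from the speaker count. This is only a sketch: the world population value below is an assumed round estimate for mid-2024 and does not come from the article or from Google's statement, and the function name is purely illustrative.

```python
def share_of_population(speakers: int, world_population: int) -> float:
    """Return speakers as a percentage of world_population."""
    return 100 * speakers / world_population


# Speaker count quoted in the company's statement.
SPEAKERS = 614_000_000
# Assumption: rough mid-2024 world population, not taken from the article.
WORLD_POPULATION = 8_100_000_000

share = share_of_population(SPEAKERS, WORLD_POPULATION)
print(f"{share:.1f}% of the world's population")  # prints "7.6% of the world's population"
```

Under that assumption the share comes out at roughly 7.6%, which rounds to the "about 8%" quoted above.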

The developers noted that these 110 languages vary widely in how actively they are used: some are spoken by hundreds of millions of people, while others are considered endangered and have almost no active speakers. In the latter case, support in a translation service can help scientists and linguists work with old written documents or preserve these languages as cultural heritage.

In a company blog post, Google software engineers pointed out that regional dialect variations and differing spelling standards are taken into account when adding language support. In particular, many indigenous languages do not have a single standard form, so there is no universally "correct" text that is independent of a specific dialect. Therefore, if a language has many dialects, the developers try to determine which one is used most widely and then train the model to produce text that is closest to that dialect. At the same time, the model was partially trained on less common dialects, which is why it can generate text with elements of different varieties of a language.

The addition of more than a hundred languages to Google Translate is part of Google's 1,000 Languages Initiative. In November 2022, the company published an article about the capabilities of AI in which it promised to develop a large language model that would support 1,000 languages, with the caveat that this would take years of work. With the new additions, the translator can now work with 243 languages.
