Google Translate Image Translation

Yoo Jin Shin
3 min read · Sep 15, 2017

Google Translate translates between about 109 languages across text, images, and more. The app has saved many travelers from getting lost because they did not know the local language. It accepts input in three main ways: typed text, images, and handwriting (drawn on an in-app writing pad). Many users choose Google Translate because it recognizes all three forms, unlike some other translation apps.

It has good learnability: the interface is kept simple, the icons are recognizable, and there is a tutorial on how to navigate the app. It also has good memorability, which comes from its minimalist interface and recognizable icons, and good accessibility, through the many languages it supports, its availability both as an app and on desktop, its ability to recognize the language being used, and its ability to do so in three different forms. Its efficiency, however, could be improved.

When translating from a photograph taken through the app's camera, the app translates the whole photo and marks off the text word by word. Once a word or phrase is tapped so that its pronunciation is visible, the app moves to the translation page. When the user presses ‘back’ to return to the image, the app goes all the way back to the home page instead of the photo. This often means taking a photo of the same page several times, which costs extra time and battery life and is unnecessary when translating several words from the same image.

Google Translate marks the different words or phrases that it identified in the photo.
Once a word is selected, the translation appears, but not how the word or phrase is pronounced in the original language.
The translation page gives more details about the particular word (how it is said in its original language and how it is said in the translated language).
Once the back button is hit, however, we are led back to the first main page of the app, not the photograph. This is a problem when there are many more words to translate in the same image!

This behavior could have been designed on the assumption that most users only need to translate one word at a time (e.g., travelers trying to figure out what a particular store sells), which would not require going back to the image several times. Another reason behind this decision could be the assumption that the user does not need to know the pronunciation of what is being translated, or that most languages are written phonetically and do not require such help. However, for languages whose writing is not phonetic (such as Chinese, or Japanese kanji, which uses Chinese characters), that assumption does not hold.

There are many times when whole pages have to be translated from an image, such as when the user is translating books, magazines, newspaper articles, restaurant menus, or instructions in a manual. There are also times when people need to know how to say a certain word or phrase, not just its translation (especially when using the tool to help learn a language). Having the back button return to the image and its results, rather than to the very first page, could be a good solution to this problem. Returning to the home page could then be done by pressing the back button twice, or through a small icon that leads straight back to the ‘home’ page.
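
The back-button change proposed above can be pictured as a simple screen history stack. The following is a minimal Python sketch of that idea, not a description of how Google Translate is actually built; the class `BackStack` and the screen names `home`, `camera_results`, and `word_detail` are hypothetical names chosen only for illustration.

```python
class BackStack:
    """Hypothetical screen history; all screen names are illustrative."""

    def __init__(self, home="home"):
        # The home screen is always the bottom of the history.
        self._screens = [home]

    @property
    def current(self):
        return self._screens[-1]

    def open(self, screen):
        # Opening a screen pushes it on top of the history.
        self._screens.append(screen)

    def back(self):
        # One press of 'back' pops a single screen,
        # but never removes the home screen itself.
        if len(self._screens) > 1:
            self._screens.pop()
        return self.current

    def go_home(self):
        # The small 'home' icon: drop everything above the home screen.
        del self._screens[1:]
        return self.current


nav = BackStack()
nav.open("camera_results")  # photo with the detected words marked
nav.open("word_detail")     # translation page with pronunciation
first_back = nav.back()     # lands on the photo results again
second_back = nav.back()    # a second press reaches the home page
```

With a history like this, the user can tap another word in the same photo after the first press of back, and the photo never has to be retaken.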

Another possible design change would be to show the phonetics in the small pop-up window. This would reduce the need for users to go to the extended translation page unless they were translating longer passages of text. However, it is important to note that these design changes cater to two kinds of users in particular: those who need to translate certain words within larger bodies of text, and those who are translating languages with a non-phonetic writing system.
