How to Build a Language Translator with Text and Audio Using Python and Google APIs

Nikita Silaparasetty
5 min read · Jul 26, 2023

In just a few simple steps, you can create a working language translator that is efficient, flexible, and easy to use. Follow this tutorial to learn how.

Image: one hand holds a paper written in one language, while another hand holds a paper written in a different language.

In today’s interconnected world, language barriers can hinder effective communication and collaboration across cultures and borders. To overcome these challenges, language translation plays a vital role in facilitating seamless global interactions. Language translation has become increasingly crucial in various domains, including international business, education, travel, and humanitarian efforts.

With the help of a language translator, we can bridge the gap between languages and enable efficient communication across diverse linguistic communities. Whether you’re a language enthusiast, a developer seeking to add language translation capabilities to your applications, or someone who doesn’t know the native language of the place you currently reside in, this tutorial will equip you with the knowledge and tools you need to build an effective and user-friendly language translator.

Let’s dive in and see how we can create a language translation tool using Python and Google APIs. You can find the full source code on GitHub.

Project Details

As mentioned earlier, we will be using Python to develop the language translator.

We will also be using ‘googletrans’, a Python library for Google Translate. This library provides a simple interface to the Google Translate service and can translate text between more than 100 languages.

Additionally, we will be using ‘gTTS’ (Google Text-to-Speech) to convert our text into audio, ‘SpeechRecognition’ to transcribe the spoken input into text, and ‘PyAudio’ to give us access to the microphone.

Once we are done building our translator, it will be able to do the following:

  • Capture some input audio
  • Recognize the words spoken in the input audio and display them as text
  • Translate the text into another given language
  • Convert this text into audio and play it

For our project, we will use ‘Hindi’ (a common language spoken in India) as the input language and ‘English’ as the output language.
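The four steps listed above can be sketched as a small pipeline in which each stage is a plain function. The names ‘run_translator’, ‘capture’, ‘recognize’, ‘translate’, and ‘speak’ are hypothetical and only illustrate the flow; the stages here are stand-ins, and the real ones are built later in this tutorial.

```python
from typing import Callable


def run_translator(
    capture: Callable[[], bytes],
    recognize: Callable[[bytes], str],
    translate: Callable[[str, str, str], str],
    speak: Callable[[str, str], None],
    inp_lang: str = "hi",
    out_lang: str = "en",
) -> str:
    """Run one capture -> recognize -> translate -> speak cycle."""
    audio = capture()                                   # 1. capture input audio
    heard = recognize(audio)                            # 2. audio -> text
    translated = translate(heard, inp_lang, out_lang)   # 3. translate the text
    speak(translated, out_lang)                         # 4. text -> audio
    return translated


# Stand-in stages (assumptions) so the sketch runs without a microphone or network.
spoken = []
result = run_translator(
    capture=lambda: b"fake-audio",
    recognize=lambda audio: "namaste",
    translate=lambda text, src, dest: f"[{src}->{dest}] {text}",
    speak=lambda text, lang: spoken.append((text, lang)),
)
print(result)
```

Keeping the stages separate like this makes it easy to swap one of them later, for example to change the output language or the playback method.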

Installations

We will start by installing the following dependencies:


pip install googletrans
pip install httpx==0.22.0
pip install gTTS
pip install SpeechRecognition
pip install pyaudio
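Before moving on, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library; the helper name ‘missing_packages’ is hypothetical, and the mapping assumes the usual import names for these packages (for example, gTTS is imported as ‘gtts’).

```python
from importlib.util import find_spec

# pip package name -> module name used in the import statement
REQUIRED_MODULES = {
    "googletrans": "googletrans",
    "httpx": "httpx",
    "gTTS": "gtts",
    "SpeechRecognition": "speech_recognition",
    "pyaudio": "pyaudio",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose modules cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
print("Missing packages:", missing or "none")
```

If the list is not empty, rerun the matching ‘pip install’ command for each package it names.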

Import the Necessary Libraries

import googletrans
from googletrans import Translator # To detect and translate text

import speech_recognition as sr # To recognize speech

from gtts import gTTS # Google Text-to-Speech to convert text to audio

import os # To work with files

Exploring Google Translate

Before we begin the main project, let’s have a quick glance at what this module can do.

First, let’s have a look at the languages that are supported by this library.

# Display the available languages

print(googletrans.LANGUAGES)

This will display a dictionary that maps each language code (such as ‘hi’ or ‘en’) to the name of the language. Next, we will count the entries in this dictionary.

# Display the number of languages supported

print(len(googletrans.LANGUAGES))

This gives the output ‘107’, which is the number of languages that this version of the library supports.
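Since the dictionary maps codes to names, finding the code for a given language name needs a reverse lookup. Here is a minimal sketch; the helper ‘code_for’ is hypothetical, and ‘SAMPLE_LANGUAGES’ is a small hand-written excerpt in the same shape as ‘googletrans.LANGUAGES’ so that the example runs without the library installed.

```python
def code_for(language_name, languages):
    """Return the language code whose name matches, or None if absent."""
    wanted = language_name.strip().lower()
    for code, name in languages.items():
        if name.lower() == wanted:
            return code
    return None


# Excerpt (assumption) in the same code -> name shape as googletrans.LANGUAGES
SAMPLE_LANGUAGES = {"en": "english", "fr": "french", "hi": "hindi", "ja": "japanese"}

print(code_for("Hindi", SAMPLE_LANGUAGES))    # hi
print(code_for("klingon", SAMPLE_LANGUAGES))  # None
```

In the real project, ‘googletrans.LANGUAGES’ can be passed as the second argument instead of the sample dictionary.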

Google Translate can also detect languages from the text that is provided to it, as shown below:

# Detect Languages

# Create the translator before calling any of its methods
translator = Translator()
print(translator.detect("Bonjour, comment tu t'appelles?"))
print(translator.detect("Kon'nichiwa, genkidesuka"))

This will give the following output:

Detected(lang=fr, confidence=None)
Detected(lang=ja, confidence=None)

As we can see, the program was able to detect that the sentences are in French (fr) and Japanese (ja) respectively.

We can then ask Google Translate to translate the given sentences:

# Translate languages

translator = Translator()
print(translator.translate("Salut comment ça va?"))
print(translator.translate("Kon'nichiwa, genkidesuka"))

The output of this code will be as follows:

Translated(src=fr, dest=en, text=Hi how are you?, pronunciation=None, extra_data="{'confiden...")
Translated(src=ja, dest=en, text=Hello, how are you, pronunciation=None, extra_data="{'confiden...")

In this way, with the help of the ‘googletrans’ Python library, we can easily perform several translation-related tasks. We will now begin building our own translator using this module.

Language Translator with Audio

We will first initialize the microphone to capture audio:

mic = sr.Microphone()

Next, we will create the recognizer, which will later transcribe the words spoken in the audio input:

rec = sr.Recognizer()

Next, we need to initialize the translator. After this, we can define the input and output languages, prompt the user to provide some audio input, and then use Google to recognize the words in the given audio, as shown below:

with mic as source:
    # Initialise the translator
    translator = Translator()

    # Define the input language and output language
    inp_lang = 'hi'
    out_lang = 'en'

    # Prompt the user to speak
    print("Please speak now...")

    # Calibrate the recognizer's energy threshold to the ambient noise level
    rec.adjust_for_ambient_noise(source, duration=0.2)

    # Keep recording until there is silence
    audio = rec.listen(source)

    # Use Google to recognize the words in the given audio
    rec_aud = rec.recognize_google(audio)

The microphone will start recording and the following message will be displayed:

Please speak now...

The program will continue to record until it detects silence. The recording is not saved to disk at this point; ‘rec.listen’ returns it as an in-memory audio object, which ‘rec.recognize_google’ then converts into text.

We will now display the recognized text, translate it, and return the translation in text format:
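Recognition can fail, for instance when the speech is unclear or the recognition service cannot be reached. One way to handle this is to retry a few times before giving up. The sketch below is only an illustration: ‘recognize_with_retries’ is a hypothetical helper, and the demo uses a fake recognizer and ‘LookupError’ so that it runs without a microphone.

```python
def recognize_with_retries(recognize_once, attempts=3, retry_on=(Exception,)):
    """Call recognize_once() until it returns text or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return recognize_once()
        except retry_on as error:
            last_error = error
            print(f"Attempt {attempt} failed: {error!r}")
    raise RuntimeError(f"No speech recognized after {attempts} attempts") from last_error


# Fake recognizer (assumption): fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise LookupError("could not understand audio")
    return "namaste aap kaise ho"


text = recognize_with_retries(flaky, attempts=3, retry_on=(LookupError,))
print(text)
```

In the real program, ‘recognize_once’ would be a function that listens and calls ‘rec.recognize_google’, and ‘retry_on’ could be set to ‘(sr.UnknownValueError, sr.RequestError)’, the two exceptions that SpeechRecognition raises for unintelligible audio and failed requests.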

# Print the input audio as text
print("Here is the audio input :" + rec_aud)

# Translate the text and display it
to_translate = translator.translate(rec_aud, src=inp_lang, dest=out_lang)
translated_text = to_translate.text
print("The translated text is: ", translated_text)

This will give the following output:

Here is the audio input :namaste aap kaise ho
The translated text is:  Hello how are you

After this, we can convert the text into audio with the help of Google Text-to-Speech and play the output:

# Convert the text to audio and play it

speak = gTTS(text=translated_text, lang=out_lang, slow=False)
speak.save("recorded_audio.mp3")

# 'start' is a Windows command that opens the file in the default player
os.system("start recorded_audio.mp3")

This will open a sound file containing the translated text in audio format.
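The ‘start’ command used above is specific to Windows. As a rough sketch, the command can be chosen based on the operating system instead. The helper ‘playback_command’ is hypothetical, and the choice of ‘open’ on macOS and ‘xdg-open’ on Linux assumes those tools are available on the machine.

```python
import platform


def playback_command(path, system=None):
    """Return a command (as a list) that opens `path` in the default player."""
    system = system or platform.system()
    if system == "Windows":
        # 'start' is a cmd builtin; the empty string is the window title argument
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":  # macOS
        return ["open", path]
    return ["xdg-open", path]  # Assumed default for Linux desktops


print(playback_command("recorded_audio.mp3"))
```

The returned list can be passed to ‘subprocess.run’ in place of the ‘os.system’ call.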

Conclusion

Thus, by integrating the Google Translate API, we can effortlessly translate text between various languages, enabling seamless communication and collaboration in a globalized world. Moreover, incorporating the Google Text-to-Speech API enhances the user experience by transforming translated text into natural-sounding speech, making it even more accessible and user-friendly.

Additional Practice

We can modify this project in several ways as we practice building our own translator:

  1. Change the input and output languages to see how accurate Google Translate is.
  2. Provide different input each time to see how well the program is able to recognize and translate it.
  3. Try building the translator in such a way that it automatically recognizes the input language and translates it into a language of our choice.
  4. With extra skills, we can even turn this project into a proper web application consisting of a basic user-friendly GUI.
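As a starting point for the third idea, here is a minimal sketch that detects the input language first and then translates it. The helper ‘translate_auto’ and the class ‘OfflineTranslator’ are hypothetical; the latter is a stand-in with fixed answers so that the example runs without network access.

```python
from types import SimpleNamespace


def translate_auto(translator, text, dest="en"):
    """Detect the language of `text`, then translate it into `dest`."""
    detected = translator.detect(text)
    if detected.lang == dest:
        return detected.lang, text  # Already in the target language
    result = translator.translate(text, src=detected.lang, dest=dest)
    return detected.lang, result.text


class OfflineTranslator:
    """Stand-in (assumption) for googletrans.Translator with fixed answers."""

    def detect(self, text):
        lang = "fr" if "Salut" in text else "en"
        return SimpleNamespace(lang=lang, confidence=None)

    def translate(self, text, dest="en", src="auto"):
        return SimpleNamespace(src=src, dest=dest, text="Hi how are you?")


print(translate_auto(OfflineTranslator(), "Salut comment ça va?"))
print(translate_auto(OfflineTranslator(), "Hello there"))
```

With the real library, pass ‘Translator()’ instead of the stand-in, and feed in the text returned by the speech recognizer.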

Have you tried building a language translator yet? What languages did you use, and how accurate was the output?

Nikita Silaparasetty

26 | Organiser, ‘AI For Women’ | Data Scientist | AI, Deep Learning Author | http://aiforwomen.org/