How to integrate Google Cloud Text-to-Speech API into your iOS app

Google Cloud recently launched a new Text-to-Speech API that features over 30 voices, available in multiple languages and variants. The available WaveNet voices produce an extremely natural and fluent sound, but even the “Basic” alternatives sound surprisingly good. You can read all about it here and you can even try it out.

Why use Google’s Cloud Text-to-Speech service?

If you’re reading this, you probably already know the answer. The reason is superior sound quality!

Apple’s SDK has offered Text-to-Speech since iOS 7 through AVSpeechSynthesizer, and it is very easy to use, taking just 4 lines of code.
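For reference, here is roughly what those few lines look like. This is a minimal sketch using AVFoundation's built-in synthesizer; the sample sentence and the "en-US" language code are placeholders of mine, and the `canImport` guard is only there so the snippet also compiles on platforms without AVFoundation.

```swift
// Placeholder text for the sketch (not from the demo project).
let greeting = "Hello from the built-in synthesizer"

#if canImport(AVFoundation)
import AVFoundation

// Keep a strong reference to the synthesizer for as long as it should speak.
let synthesizer = AVSpeechSynthesizer()

let utterance = AVSpeechUtterance(string: greeting)
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
synthesizer.speak(utterance)
#endif
```

In a real app the synthesizer would normally live in a property of a view controller or service object rather than at the top level.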

But Apple’s sound quality is very bad in comparison: it does not use Siri’s voice and relies on other, robotic-sounding voices instead. Even the “enhanced” voices, which the user has to download first, still sound pretty bad by today’s standards.

Before getting into the code

As with any Google Cloud API, the API has to be enabled on a project within the Google Cloud Console, and all the API calls will be associated with that project. To set up a project in the Google Cloud Console, you can follow all the steps described here, except that this demo app requires an API key instead of a service account key.

Summarized steps:
1. Create a project (or use an existing one) in the Cloud Console.
2. Make sure that billing is enabled for your project.
3. Enable the Text-to-Speech API.
4. Create an API key.

Now onto the fun part…THE CODE 🙌

We will be creating a simple demo app with basic input controls.
Our app will have:
1. Text view — to enter the text that we want to convert to audio.
2. Segmented controls — to switch between the different voice options.
3. Speak button — to start the speech service.

Download the starter project from here, uncompress and open the project. After running it, you should see something like this:

Now let’s add the SpeechService class to our project. This class has all you need to communicate with the Google Cloud Text-to-Speech API. Each important piece in this file has comments explaining its purpose, so make sure you go through them.

Before continuing, make sure you replace <YOUR_API_KEY> with your actual API key, created in the API enabling steps above.
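To give an idea of where the API key ends up, here is a rough sketch of the HTTP request that a class like SpeechService needs to send. It is not the code from the project: `SynthesisRequest` and `makeSynthesisRequest` are hypothetical names, and the `v1` endpoint, the "en-US" language code and the MP3 encoding are assumptions you may want to adjust.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

// JSON body expected by the text:synthesize REST method.
struct SynthesisRequest: Codable {
    struct Input: Codable { let text: String }
    struct Voice: Codable { let languageCode: String; let name: String }
    struct AudioConfig: Codable { let audioEncoding: String }

    let input: Input
    let voice: Voice
    let audioConfig: AudioConfig
}

// Hypothetical helper: builds the POST request, but does not send it.
func makeSynthesisRequest(text: String, voiceName: String, apiKey: String) throws -> URLRequest {
    // Assumption: the v1 endpoint; older samples use v1beta1.
    let url = URL(string: "https://texttospeech.googleapis.com/v1/text:synthesize")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    // The API key travels in a header here; it can also be passed as a ?key= query parameter.
    request.setValue(apiKey, forHTTPHeaderField: "X-Goog-Api-Key")
    request.setValue("application/json; charset=utf-8", forHTTPHeaderField: "Content-Type")

    let body = SynthesisRequest(
        input: .init(text: text),
        voice: .init(languageCode: "en-US", name: voiceName),
        audioConfig: .init(audioEncoding: "MP3"))
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```

The resulting request can then be handed to `URLSession.shared.dataTask(with:completionHandler:)`.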

The way you interface with the SpeechService class is as simple as:

SpeechService.shared.speak(text: "My text") {
    // Finished speaking
}

Now, finally, let’s use the SpeechService class on our “Speak” button press action. For that, we need to update the didPressSpeakButton function as follows:

Now, run the app and press “Speak”. Did it start speaking? YAY!!!
We just converted the TextView text into audio and played it.
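Under the hood, the API answers with a JSON object whose `audioContent` field holds the audio as a base64-encoded string. A minimal sketch of the decoding step could look like this; `SynthesisResponse`, `SynthesisError` and `decodeAudio` are hypothetical names, not necessarily what the SpeechService class uses.

```swift
import Foundation

// Shape of a successful text:synthesize response: one base64-encoded string.
struct SynthesisResponse: Decodable {
    let audioContent: String
}

enum SynthesisError: Error {
    case invalidBase64
}

// Hypothetical helper: turns the raw response body into playable audio bytes.
func decodeAudio(from responseBody: Data) throws -> Data {
    let response = try JSONDecoder().decode(SynthesisResponse.self, from: responseBody)
    guard let audio = Data(base64Encoded: response.audioContent) else {
        throw SynthesisError.invalidBase64
    }
    return audio
}
```

On iOS, the returned data can be passed to `AVAudioPlayer(data:)` and played from there.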

So, what’s next? As you probably noticed, our UI has other options that we are not applying. To be able to switch between the different voice categories and genders, and to also disable the “Speak” button while speaking, we need to update the didPressSpeakButton function again:
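One way to think about the voice selection is as a small mapping from the two segmented controls to a voice name. The sketch below is an assumption of mine rather than the project's code: the enum names, the order of the segments and the four voice names are illustrative, so check them against the API's list of available voices before relying on them.

```swift
// Assumed segment order: index 0 is the first segment of each control.
enum VoiceCategory: Int {
    case waveNet = 0, standard
}

enum VoiceGender: Int {
    case female = 0, male
}

// Hypothetical mapping from the selected options to a Google Cloud voice name.
func voiceName(category: VoiceCategory, gender: VoiceGender) -> String {
    switch (category, gender) {
    case (.waveNet, .female): return "en-US-Wavenet-F"
    case (.waveNet, .male): return "en-US-Wavenet-D"
    case (.standard, .female): return "en-US-Standard-E"
    case (.standard, .male): return "en-US-Standard-D"
    }
}

// Convenience for raw segment indices; returns nil for an unknown segment.
func voiceName(categoryIndex: Int, genderIndex: Int) -> String? {
    guard let category = VoiceCategory(rawValue: categoryIndex),
          let gender = VoiceGender(rawValue: genderIndex) else {
        return nil
    }
    return voiceName(category: category, gender: gender)
}
```

In the button action, the indices would come from each control's `selectedSegmentIndex`. Disabling the button is then a matter of setting its `isEnabled` to false before speaking and back to true in the completion closure.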

Run the app one last time. Try the different voice options (category and gender). Did it work? NICE!!! Now you have your own personal J.A.R.V.I.S.!…hmm, maybe not quite yet.

The complete project is available here.


The Google Cloud Text-to-Speech API is very easy to use and integrate, and it’s quite capable, with impressive audio results. But obviously, all those nice features don’t come for free. You can check the pricing details here, which I think are reasonable. If audio quality is not a priority in your app, definitely use Apple’s built-in AVSpeechSynthesizer, which is super easy to use and free.

That’s all for now, thanks for reading! See you next time!




A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

Alejandro Cotilla

Love tinkering with new technologies and building enjoyable user experiences.
