Generating speech audio from large texts using an API (free)

In this tutorial, we start with a simple example of using the Large Text to Speech API in Python with the requests library.

Kristian Slosar
3 min read · Feb 12, 2022

TL;DR: code is here or at the end of this tutorial.

Photo by Chris Ried on Unsplash

Start by importing required modules.
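
As a rough sketch, the imports might look like this. Only `requests` is named in this tutorial; `json` and `time` are assumptions based on the later steps (building the payload and waiting for the job).

```python
import json      # serialize the request payload (assumed to be needed later)
import time      # wait between status checks (assumed to be needed later)

import requests  # third-party HTTP client: pip install requests
```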

Next, we define a text variable that we will synthesize using the API.
Notice that we define the text using triple quotes ("""), so we don't have to worry about line breaks or other quotation marks in the text.
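
A small illustration of a triple-quoted string follows. The sample sentences are placeholders, not the text used in the original tutorial.

```python
# Placeholder text: line breaks and both kinds of quotes are kept as typed.
text = """This is the first line of the sample text.
It contains "double quotes" and 'single quotes',
and it continues on a third line."""

print(text)
```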

Our example text is quite short, but it can be of any* size!

Now we define our endpoint URL, payload, and headers.
In production, never hardcode your API key in the source code! Load it from an environment variable or a secrets manager instead.
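
The sketch below shows one way to set these up. The host name, the `/tts` path, the `"text"` payload field, and the `RAPIDAPI_KEY` environment variable are all assumptions for illustration; check the provider's documentation for the real values. The `x-rapidapi-host` and `x-rapidapi-key` header names follow the usual RapidAPI convention.

```python
import json
import os

# Assumed endpoint for illustration; copy the real one from the provider's docs.
API_HOST = "large-text-to-speech.p.rapidapi.com"
API_URL = f"https://{API_HOST}/tts"

# Read the key from the environment instead of hardcoding it.
# RAPIDAPI_KEY is a hypothetical variable name.
api_key = os.environ.get("RAPIDAPI_KEY", "")

text = """Hello from the tutorial."""

# Assumed request body: a JSON object with a single "text" field.
payload = json.dumps({"text": text})

headers = {
    "content-type": "application/json",
    "x-rapidapi-host": API_HOST,
    "x-rapidapi-key": api_key,
}
```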

Get your API key (free).

Now we can create our POST request. The response contains the unique id of the request, which we will need later to get the status and URL of our audio file.
We also get eta, an estimate (in seconds) of how long the job will take, which we can use to wait that amount of time before checking the result.
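
A minimal sketch of this step, assuming the response is a JSON object with `id` and `eta` keys as described above. The function name `submit_job` and its `http` parameter (which only exists so the call can be exercised without a network) are hypothetical.

```python
import json

import requests


def submit_job(url, text, headers, http=requests):
    """POST the text and return (job_id, eta_seconds) from the JSON response."""
    response = http.post(
        url,
        data=json.dumps({"text": text}),  # assumed payload shape
        headers=headers,
        timeout=30,
    )
    response.raise_for_status()  # fail early on HTTP-level errors
    body = response.json()
    return body["id"], body["eta"]


# Typical use (not executed here, since it needs a real key and endpoint):
# job_id, eta = submit_job(API_URL, text, headers)
# time.sleep(eta)  # wait roughly as long as the service estimates
```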

Finally, we create a GET request.
If the response object contains a "url" key, we have the URL of our file and can save it.

Otherwise, we need to wait longer and send the GET request again.
If we receive an "error" key in the response, we can print it.
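
The polling logic can be sketched as below. Passing the job id as an `id` query parameter is an assumption, as are the hypothetical helper names `wait_for_audio` and `download`, the 5-second polling interval, and the retry cap. This sketch raises on an "error" key instead of printing it, so the caller decides how to report it.

```python
import time

import requests


def wait_for_audio(url, job_id, headers, http=requests,
                   poll_seconds=5, max_polls=120, sleep=time.sleep):
    """Poll the job until the response contains "url"; raise on "error"."""
    for _ in range(max_polls):
        # Assumption: the job id is sent as a query parameter named "id".
        response = http.get(url, params={"id": job_id}, headers=headers, timeout=30)
        response.raise_for_status()
        body = response.json()
        if "url" in body:
            return body["url"]  # the audio file is ready
        if "error" in body:
            raise RuntimeError(body["error"])
        sleep(poll_seconds)  # not ready yet: wait and ask again
    raise TimeoutError(f"job {job_id} was not ready after {max_polls} polls")


def download(file_url, path, http=requests):
    """Save the finished audio file to disk."""
    response = http.get(file_url, timeout=60)
    response.raise_for_status()
    with open(path, "wb") as f:
        f.write(response.content)
```

Capping the number of polls keeps a stuck job from looping forever, which matters more as the input text grows.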

Full code for your convenience.

Enjoy and have fun!

If you have any questions, let me know in the comments or here.

PS: *The actual limit on text size comes from the HTTP POST request size limits of the endpoint; for example, the RapidAPI maximum request size is 50 MB. Tested with ~12,000 words (72,000 characters, 24 pages!), which produced 1 h 18 min of audio and completed in 16 min.
