Text-to-speech converter using AWS Lambda, Polly, and API Gateway

Mar 7, 2024

In today’s digital age, Text-to-Speech (TTS) converters play a vital role in enhancing accessibility, efficiency, and inclusivity across various platforms by transforming written text into spoken words.

To build a TTS converter on AWS, we’ll go through the following steps:

Create an IAM role that gives your Lambda function permission to use Amazon Polly.
Create the Lambda function that will handle the text-to-speech conversion.
Create an API Gateway to act as the entry point to the serverless application.
Deploy and test the API.

Step 1: Create the IAM Role

Go to the AWS IAM console.
Click on “Roles” in the left sidebar.
Click on “Create role”.
For the service that will use this role, choose Lambda.
Attach the “AmazonPollyFullAccess” policy (or create a custom policy that allows only polly:SynthesizeSpeech, the single Polly action this function calls).
Finish creating the role.
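
If you go the custom-policy route, the following is a minimal sketch of the two JSON documents involved, built in Python so they can be printed and pasted into the IAM console’s JSON editor. It assumes the function only ever calls polly:SynthesizeSpeech; the function names are illustrative, not part of any AWS library.

```python
import json

def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }

def polly_synthesize_policy():
    """Permissions policy limited to the one Polly action assumed here."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "polly:SynthesizeSpeech",
                "Resource": "*",
            }
        ],
    }

if __name__ == "__main__":
    print(json.dumps(lambda_trust_policy(), indent=2))
    print(json.dumps(polly_synthesize_policy(), indent=2))
```

Choosing Lambda as the trusted service in the console generates the trust policy for you, so in practice only the second document needs to be pasted in. If you also want the function to write logs to CloudWatch, the AWS managed policy AWSLambdaBasicExecutionRole is commonly attached alongside it.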

Step 2: Create the Lambda Function

Go to the AWS Lambda console.
Click on “Create function”.
Choose “Author from scratch”.
Fill in the details:

Name: TextToSpeechConverter
Runtime: Python 3.x (or the language of your choice; the code below is Python)
Role: Choose an existing role and select the role you created earlier.

Now update the Lambda code source with the following code:

import json
import base64

import boto3

# Create the Polly client once so warm invocations can reuse it
polly = boto3.client('polly')

def lambda_handler(event, context):
    # queryStringParameters is absent (or None) when the request has no query string
    params = event.get('queryStringParameters') or {}
    text = params.get('text')
    if not text:
        # A missing parameter is a client error, not a server error
        return {
            'statusCode': 400,
            'body': json.dumps("Missing required query parameter 'text'")
        }

    try:
        # Synthesize speech
        response = polly.synthesize_speech(
            Text=text,
            OutputFormat='mp3',
            VoiceId='Joanna'
        )

        # Return audio as base64 encoded string
        audio_stream = response['AudioStream'].read()
        audio_base64 = base64.b64encode(audio_stream).decode('utf-8')

        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'audio/mpeg',
            },
            'body': audio_base64,
            'isBase64Encoded': True
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps(str(e))
        }

Click on Deploy to finish deploying the function.
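
The handler returns the MP3 as a base64 string with isBase64Encoded set to True, because a Lambda proxy response body has to be text; API Gateway then decodes it back to binary before replying to the client. The sketch below imitates that round trip locally with placeholder bytes. build_audio_response and decode_body are hypothetical helper names used only for illustration.

```python
import base64

def build_audio_response(audio_bytes):
    """Mirror the shape of the handler's success response."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "audio/mpeg"},
        "body": base64.b64encode(audio_bytes).decode("utf-8"),
        "isBase64Encoded": True,
    }

def decode_body(response):
    """Imitate the decoding step applied before the client sees the body."""
    body = response["body"]
    if response.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")

if __name__ == "__main__":
    # Placeholder bytes standing in for Polly's audio stream, not a real recording
    fake_audio = b"\xff\xf3 placeholder audio bytes"
    response = build_audio_response(fake_audio)
    assert decode_body(response) == fake_audio
    print(len(fake_audio), "bytes became", len(response["body"]), "base64 characters")
```

Base64 adds roughly a third to the payload size, which is worth keeping in mind for long texts.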

Step 3: Create API Gateway

Go to the API Gateway console.
Choose “HTTP API”.
Click on “Build”.
Configure your API:

Add integration type: “Lambda”
API name: TextToSpeechConverterAPI
Lambda Function: Select the Lambda function you created (TextToSpeechConverter)

In “Routes”, set the resource path to /TextToSpeechConverter.
Set the integration target to TextToSpeechConverter.
Give the stage a name. Stages are independently configurable environments that your API can be deployed to. Here we’re going to use test.
Review and create the API Gateway.
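
To try the function from the Lambda console’s Test tab before calling it through the API, it helps to know roughly what the incoming event looks like. The sketch below builds a trimmed-down event in the shape of API Gateway’s HTTP API payload format 2.0. That format is an assumption about how the integration is configured, real events carry many more fields, and sample_event and extract_text are illustrative names.

```python
import json
from urllib.parse import quote

def sample_event(text, stage="test", route="/TextToSpeechConverter"):
    """Build a minimal, hand-written stand-in for an HTTP API (payload v2.0) event."""
    return {
        "version": "2.0",
        # Assumes a GET route; a route created with the ANY method reports "ANY <path>"
        "routeKey": f"GET {route}",
        "rawPath": f"/{stage}{route}",
        "rawQueryString": "text=" + quote(text, safe=""),
        # The parsed parameters hold the decoded value
        "queryStringParameters": {"text": text},
        "requestContext": {"http": {"method": "GET"}, "stage": stage},
        "isBase64Encoded": False,
    }

def extract_text(event):
    """Read the text parameter, returning None if it is absent."""
    params = event.get("queryStringParameters") or {}
    return params.get("text")

if __name__ == "__main__":
    event = sample_event("hello, this is a tutorial test!")
    print(json.dumps(event, indent=2))  # paste this JSON into a Lambda test event
    print(extract_text(event))
```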

Step 4: Test your API

Test the API with the curl command on Linux.
You’ll need the API ID, which you’ll find in the left panel of API Gateway under APIs.
Go to your terminal and send a GET request to your endpoint. Here’s an example:

curl -X GET "https://dmobfl29v8.execute-api.us-east-1.amazonaws.com/test/TextToSpeechConverter?text=hello,%20this%20is%20a%20tutorial%20test!" --output output.mp3

-X GET: Specifies that this is a GET request (GET is curl’s default method, so the flag is optional).
"https://dmobfl29v8.execute-api.us-east-1.amazonaws.com/test/TextToSpeechConverter?text=hello,%20this%20is%20a%20tutorial%20test!": This is the API endpoint, made up of the API ID, region, stage name, and route. It includes the text query parameter with the value "hello, this is a tutorial test!", where each space is percent-encoded as %20.
--output output.mp3: This option specifies that the output of the request (the audio file) should be saved to a file named output.mp3 in the current directory.
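
If you would rather call the endpoint from a script, here is a rough Python equivalent of the curl command using only the standard library. It is a sketch under the same assumptions as the example URL (region us-east-1, stage test, route TextToSpeechConverter); YOUR_API_ID is a placeholder, and build_invoke_url and download_speech are illustrative names.

```python
import urllib.request
from urllib.parse import quote

def build_invoke_url(api_id, region, stage, route, text):
    """Assemble the endpoint URL and percent-encode the text parameter."""
    base = f"https://{api_id}.execute-api.{region}.amazonaws.com"
    return f"{base}/{stage}/{route.lstrip('/')}?text={quote(text, safe='')}"

def download_speech(url, path="output.mp3"):
    """Send a GET request and save the response body to a file."""
    with urllib.request.urlopen(url, timeout=30) as resp, open(path, "wb") as f:
        f.write(resp.read())
    return path

if __name__ == "__main__":
    # Placeholder API ID: replace it with your own before downloading
    url = build_invoke_url("YOUR_API_ID", "us-east-1", "test",
                           "TextToSpeechConverter", "hello, this is a tutorial test!")
    print(url)
    # download_speech(url)  # uncomment once the URL points at your deployed API
```

Unlike the hand-written URL, quote also encodes the comma and the exclamation mark, which the API decodes the same way. urlopen raises urllib.error.HTTPError on 4xx and 5xx responses, so a failed request will not silently write an error message into output.mp3.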

Now, it’s time to listen to the audio!

That’s it! I hope you liked it, and have a good one!

Written by Lucas Soldeira Ludicsa