Text-to-speech converter using AWS Lambda, Polly, and API Gateway
In today’s digital age, Text-to-Speech (TTS) converters play a vital role in enhancing accessibility, efficiency, and inclusivity across various platforms by transforming written text into spoken words.
To build a TTS converter on AWS, we’ll follow these steps:
- Create an IAM role that gives your Lambda function permission to use Amazon Polly.
- Create the Lambda function that will handle the text-to-speech conversion.
- Create an API Gateway to act as the entry point to the serverless application.
- Deploy and test the API.
Step 1: Create the IAM Role
- Go to the AWS IAM console.
- Click on “Roles” in the left sidebar.
- Click on “Create role”.
- For the service that will use this role, choose Lambda.
- Attach the “AmazonPollyFullAccess” policy (or you can create a custom policy with just the permissions needed).
- Finish creating the role.
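If you prefer the custom-policy option mentioned above, the sketch below builds a minimal policy document that you could paste into the IAM policy editor. It assumes that `polly:SynthesizeSpeech` is the only Polly action the function calls, which matches the code used later in this tutorial; the `build_polly_policy` helper name is hypothetical and only there for illustration.

```python
import json


def build_polly_policy():
    """Return a least-privilege IAM policy document (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: the function only ever calls SynthesizeSpeech
                "Action": ["polly:SynthesizeSpeech"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be copied into the policy editor
    print(json.dumps(build_polly_policy(), indent=2))
```

Compared with “AmazonPollyFullAccess”, a policy like this limits the role to speech synthesis only.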
Step 2: Create the Lambda Function
- Go to the AWS Lambda console.
- Click on “Create function”.
- Choose “Author from scratch”.
- Fill in the details:
  - Name: TextToSpeechConverter
  - Runtime: Python 3.x (or language of your choice)
  - Role: Choose an existing role and select the role you created in Step 1.
Now update the Lambda code source with the following code:
    import json
    import boto3
    import base64

    def lambda_handler(event, context):
        # queryStringParameters is missing or None when the request has no query string
        params = event.get('queryStringParameters') or {}
        text = params.get('text')
        if not text:
            return {
                'statusCode': 400,
                'body': json.dumps('Missing required query parameter: text')
            }

        try:
            # Initialize Polly client
            polly = boto3.client('polly')

            # Synthesize speech
            response = polly.synthesize_speech(
                Text=text,
                OutputFormat='mp3',
                VoiceId='Joanna'
            )

            # Return audio as base64 encoded string
            audio_stream = response['AudioStream'].read()
            audio_base64 = base64.b64encode(audio_stream).decode('utf-8')

            return {
                'statusCode': 200,
                'headers': {
                    'Content-Type': 'audio/mpeg',
                },
                'body': audio_base64,
                'isBase64Encoded': True
            }
        except Exception as e:
            # Report any Polly or encoding failure as a server error
            return {
                'statusCode': 500,
                'body': json.dumps(str(e))
            }
Click on Deploy to finish deploying the function.
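Before wiring up the API, it can help to exercise the handler logic locally without calling AWS. The sketch below is a simplified variant of the handler, not the deployed code: the Polly client is passed in as an argument so a fake can be substituted, and a check for a missing text parameter is included. The names `FakePolly` and `handle` are hypothetical, and the fake returns placeholder bytes rather than real MP3 audio.

```python
import base64
import io
import json


class FakePolly:
    """Hypothetical stand-in for boto3.client('polly'), for local testing only."""

    def synthesize_speech(self, Text, OutputFormat, VoiceId):
        # The real response exposes a stream with .read(); BytesIO mimics that
        return {"AudioStream": io.BytesIO(b"fake-mp3:" + Text.encode("utf-8"))}


def handle(event, polly):
    """Convert the 'text' query parameter to a base64-encoded audio response."""
    params = event.get("queryStringParameters") or {}
    text = params.get("text")
    if not text:
        return {
            "statusCode": 400,
            "body": json.dumps("Missing required query parameter: text"),
        }
    response = polly.synthesize_speech(Text=text, OutputFormat="mp3", VoiceId="Joanna")
    audio = response["AudioStream"].read()
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "audio/mpeg"},
        "body": base64.b64encode(audio).decode("utf-8"),
        "isBase64Encoded": True,
    }


if __name__ == "__main__":
    # Simulate a request with a query string, then one without
    ok = handle({"queryStringParameters": {"text": "hello"}}, FakePolly())
    print(ok["statusCode"], base64.b64decode(ok["body"]))
    print(handle({}, FakePolly())["statusCode"])
```

Passing the client in keeps the request parsing and response shaping testable on their own; in Lambda you would pass `boto3.client('polly')` instead of the fake.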
Step 3: Create the API Gateway
- Go to the API Gateway console.
- Choose “HTTP API”.
- Click on “Build”.
- Configure your API:
  - Add integration type: “Lambda”
  - Give the API a name: TextToSpeechConverterAPI
  - Lambda function: Select the Lambda function you created (TextToSpeechConverter)
- In “Routes”, configure the resource path to /TextToSpeechConverter.
- Configure the integration target to TextToSpeechConverter.
- Give the stage a name. Stages are independently configurable environments that your API can be deployed to. Here we’re going to use test.
- Review and create the API Gateway.
Step 4: Test your API
- Test the API using the curl command on Linux. You’ll need the API ID, which you’ll find in the left section of API Gateway → APIs.
- Go to your terminal and send a GET request to your endpoint. Here’s an example:
curl -X GET "https://dmobfl29v8.execute-api.us-east-1.amazonaws.com/test/TextToSpeechConverter?text=hello,%20this%20is%20a%20tutorial%20test!" --output output.mp3
- -X GET: Specifies that this is a GET request.
- "https://dmobfl29v8.execute-api.us-east-1.amazonaws.com/test/TextToSpeechConverter?text=hello,%20this%20is%20a%20tutorial%20test!": This is the API endpoint with the API ID, region, and stage name. It includes the text query parameter with the value "hello, this is a tutorial test!".
- --output output.mp3: Specifies that the output of the request (the audio file) should be saved to a file named output.mp3 in the current directory.
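If you would rather call the endpoint from Python than from curl, the following sketch shows one way to assemble the same URL and save the response. It uses only the standard library; `build_endpoint` and `download_speech` are hypothetical helper names, and the API ID is the example value from the curl command, so replace it with your own. The download step assumes the endpoint returns the MP3 bytes directly, as in the curl example.

```python
import urllib.parse
import urllib.request


def build_endpoint(api_id, region, stage, route, text):
    """Assemble the invoke URL from its parts and percent-encode the text."""
    query = urllib.parse.urlencode({"text": text}, quote_via=urllib.parse.quote)
    return f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}/{route}?{query}"


def download_speech(url, path="output.mp3"):
    """Send a GET request and save the response body, like curl --output."""
    with urllib.request.urlopen(url) as resp, open(path, "wb") as out:
        out.write(resp.read())


if __name__ == "__main__":
    url = build_endpoint(
        "dmobfl29v8",  # example API ID from this tutorial; use your own
        "us-east-1",
        "test",
        "TextToSpeechConverter",
        "hello, this is a tutorial test!",
    )
    print(url)
    # download_speech(url)  # uncomment once your own API is deployed
```

Letting `urlencode` handle the escaping avoids hand-writing `%20` sequences and also encodes characters such as commas and exclamation marks.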
Now, it’s time to listen to the audio!
That’s it. I hope you liked it, and have a good one!