Amazon Polly is a low-cost, easy-to-use text-to-speech API with impressive-sounding voices. The free tier covers 5 million characters per month for the first 12 months, which is enough to convert roughly 1,600 average emails to speech each month at no cost.
We can quickly get started with Polly using Node.js.
Download Sample Project
The AWS Node.js SDK documentation provides a sample project for getting started with the SDK. We are going to use it as the starting point for this example as well. Clone it:
$ git clone https://github.com/awslabs/aws-nodejs-sample.git
Configure AWS Keys
We need an AWS access key ID and secret access key for our AWS account. If you don’t have them, take a look at this documentation. Once you have them, create a file at ~/.aws/credentials (C:\Users\USER_NAME\.aws\credentials for Windows users) with this content:
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
The credential file is what the aws-sdk uses by default to connect to your AWS account. Overall, be careful with these keys. They could do damage in the wrong hands.
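To confirm the file is in place before calling Polly, a small check along these lines can help. This is only a sketch: hasDefaultCredentials is a hypothetical helper, not part of the aws-sdk, and it only looks for the two keys under a [default] profile header, which is the profile the SDK reads when none is specified.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: returns true when the INI-style text has a [default]
// profile that contains both keys the SDK needs.
function hasDefaultCredentials(text) {
  let inDefault = false;
  const found = new Set();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith('[')) {
      // A new profile header starts; track whether it is the default one.
      inDefault = line === '[default]';
    } else if (inDefault && line.includes('=')) {
      found.add(line.split('=')[0].trim());
    }
  }
  return found.has('aws_access_key_id') && found.has('aws_secret_access_key');
}

// Check the default credentials location for the current user.
const credentialsPath = path.join(os.homedir(), '.aws', 'credentials');
if (fs.existsSync(credentialsPath)) {
  const ok = hasDefaultCredentials(fs.readFileSync(credentialsPath, 'utf8'));
  console.log(ok ? 'Default profile looks complete.' : 'Default profile is missing a key.');
} else {
  console.log('No credentials file found at ' + credentialsPath);
}
```

The check never prints the key values themselves, which keeps them out of terminal history and logs.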
Text-to-Speech to an MP3 file
The quickest way to use Text-to-Speech is to make the API request to Amazon Polly and write the contents to a file.
Now we can create a new file called amazon-polly-file.js:
cd aws-nodejs-sample && touch amazon-polly-file.js
Here is the code for amazon-polly-file.js:
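A minimal sketch of what this file can look like, assuming the v2 aws-sdk package from the sample project is installed (npm install). The helper names buildParams and synthesizeToFile, the us-east-1 region, and the speech.mp3 output name are illustrative choices, not anything the SDK requires.

```javascript
// amazon-polly-file.js (sketch, assumes the v2 `aws-sdk` package is installed)
const fs = require('fs');
const path = require('path');

// Request parameters for Polly's SynthesizeSpeech call.
function buildParams(text) {
  return {
    Text: text,
    OutputFormat: 'mp3',
    VoiceId: 'Kimberly'
  };
}

// Send `text` to Polly and write the returned audio to `outPath`.
// `polly` is any object exposing synthesizeSpeech(params, callback),
// normally an AWS.Polly client.
function synthesizeToFile(polly, text, outPath, done) {
  polly.synthesizeSpeech(buildParams(text), (err, data) => {
    if (err) return done(err);
    if (!Buffer.isBuffer(data.AudioStream)) {
      return done(new Error('Polly returned no audio data'));
    }
    fs.writeFile(outPath, data.AudioStream, done);
  });
}

// Only call AWS when run as `node amazon-polly-file.js`, so the helpers
// above can be loaded elsewhere without network access.
if (path.basename(process.argv[1] || '') === 'amazon-polly-file.js') {
  const AWS = require('aws-sdk');
  // The region is an assumption; pick one where Polly is available to you.
  const polly = new AWS.Polly({ region: 'us-east-1' });
  synthesizeToFile(polly, 'Hi, my name is @anapfox', './speech.mp3', (err) => {
    if (err) return console.error(err.message);
    console.log('Saved speech.mp3');
  });
}
```

Passing the client in as an argument keeps the file-writing logic separate from the AWS setup, so it can be exercised with a stub client.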
You can run it with:
node amazon-polly-file.js
In this example, we take the text ‘Hi, my name is @anapfox’, send it to Polly, and write the audio that comes back to an MP3 file.
As you can see, we are setting the VoiceId to Kimberly. You can check out all the valid voices here.
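The valid voices can also be listed from code. The sketch below assumes the v2 aws-sdk package and its describeVoices call; listVoiceIds and the list-voices.js file name are hypothetical, and the region is an assumption. Results can be paginated through NextToken, and this sketch reads only the first page.

```javascript
const path = require('path');

// Fetch the voices Polly offers for a language and return just their IDs.
// `polly` is any object exposing describeVoices(params, callback).
function listVoiceIds(polly, languageCode, done) {
  polly.describeVoices({ LanguageCode: languageCode }, (err, data) => {
    if (err) return done(err);
    done(null, data.Voices.map((voice) => voice.Id));
  });
}

// list-voices.js is a hypothetical file name; AWS is only called when the
// script is run under that name.
if (path.basename(process.argv[1] || '') === 'list-voices.js') {
  const AWS = require('aws-sdk');
  const polly = new AWS.Polly({ region: 'us-east-1' }); // region is an assumption
  listVoiceIds(polly, 'en-US', (err, ids) => {
    if (err) return console.error(err.message);
    console.log(ids.join(', '));
  });
}
```

Any ID this prints can be used as the VoiceId value in the synthesis request.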
Text-to-Speech to Speaker
For some applications, we want to synthesize the speech and send it straight to a speaker. We can use a Node module called speaker, which is a writable stream that plays PCM audio through your speakers. Let’s add it to our application:
npm i --save speaker
Now we can create a new file called amazon-polly-speaker.js:
touch amazon-polly-speaker.js
Here is the code for amazon-polly-speaker.js:
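A minimal sketch of what this file can look like, assuming the v2 aws-sdk package and the speaker module are both installed. The helper names buildParams and playPcm and the us-east-1 region are illustrative. Polly is asked for raw PCM here because speaker plays PCM, not MP3; Polly's PCM output is signed 16-bit, mono, little-endian, so the Speaker options are set to match.

```javascript
// amazon-polly-speaker.js (sketch, assumes `aws-sdk` v2 and `speaker` are installed)
const path = require('path');
const { PassThrough } = require('stream');

// Sample rate shared by the Polly request and the Speaker settings.
const SAMPLE_RATE = 16000;

// Request parameters for raw PCM audio instead of MP3.
function buildParams(text) {
  return {
    Text: text,
    OutputFormat: 'pcm',
    SampleRate: String(SAMPLE_RATE),
    VoiceId: 'Kimberly'
  };
}

// Wrap the audio buffer in a stream and pipe it into `sink`,
// which is any writable stream (a Speaker instance in practice).
function playPcm(audioBuffer, sink) {
  const audioStream = new PassThrough();
  audioStream.end(audioBuffer);
  return audioStream.pipe(sink);
}

// Only call AWS and open the audio device when run as
// `node amazon-polly-speaker.js`.
if (path.basename(process.argv[1] || '') === 'amazon-polly-speaker.js') {
  const AWS = require('aws-sdk');
  const Speaker = require('speaker');
  const polly = new AWS.Polly({ region: 'us-east-1' }); // region is an assumption
  polly.synthesizeSpeech(buildParams('Hi, my name is @anapfox'), (err, data) => {
    if (err) return console.error(err.message);
    if (!Buffer.isBuffer(data.AudioStream)) return console.error('No audio returned');
    const player = new Speaker({ channels: 1, bitDepth: 16, sampleRate: SAMPLE_RATE });
    playPcm(data.AudioStream, player);
  });
}
```

If the sample rate in the request and in the Speaker options differ, the audio plays back at the wrong speed, which is why both read from the same constant.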
You can run it with:
node amazon-polly-speaker.js
In this speaker example, we do the same thing, but instead of writing a file we create a stream from the audio we get back from Polly and pipe that stream to the speaker module.
All Done 🤗
If I missed anything, feel free to reach out to me on Twitter.