Create a video from a Medium Post using Amazon Polly and Amazon Comprehend
This is the story of the code behind the Faceless Influencer


As discussed in the aforementioned post, this creator will make simple content by taking the text from the post, analyzing its entities, taking a screenshot and putting it all together in a video. Sounds easy, even a bot can do it.
Let’s see how.
Taking the text from the post
It just so happens that Medium will provide JSON for almost every URL. You just need to append ?format=json at the end. So we build a post URL using the username and the postId, and with a simple GET request we get the post info.
Here you can find an example of the data returned by Medium on a post. Once we have the data, we keep only titles and paragraphs, ignoring image captions, quotes, and the other types of text that Medium defines.
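To make this step concrete, here is a minimal sketch of building the URL, cleaning the response body and filtering the paragraphs. Several things here are assumptions rather than the exact code of the bot: the URL shape, the guard prefix that Medium puts in front of its JSON, the `{ type, text }` paragraph shape, and the numeric type codes in `KEPT_TYPES`. Check them against a real response before relying on them.

```javascript
// Assumption: Medium prepends this guard string to its JSON responses.
const JSON_GUARD = "])}while(1);</x>";

// Assumption: posts can be addressed as /@username/postId.
function buildPostUrl(username, postId) {
  return `https://medium.com/@${username}/${postId}?format=json`;
}

// Strip the guard (if present) and parse the rest as JSON.
function parseMediumJson(body) {
  const json = body.startsWith(JSON_GUARD)
    ? body.slice(JSON_GUARD.length)
    : body;
  return JSON.parse(json);
}

// Assumption: these type codes stand for plain paragraphs and titles.
const KEPT_TYPES = new Set([1, 3, 13]);

// Keep only the text of the paragraph types we care about.
function extractTexts(paragraphs, keptTypes = KEPT_TYPES) {
  return paragraphs
    .filter((p) => keptTypes.has(p.type) && p.text)
    .map((p) => p.text.trim());
}
```

The body of the GET request goes through `parseMediumJson`, and the list of paragraphs found in the parsed payload goes through `extractTexts`.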
Cool. Now let’s look for entities in the text.
Analyzing post’s entities with Amazon Comprehend
Finding the entities is really easy, just a method call passing the text as a parameter. The next thing we do is to count the entities and entity types, for example in this post “Medium” could be an entity and “ORGANIZATION” could be its entity type.
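The counting part can be sketched like this. It assumes the input is the `Entities` array from a Comprehend entity detection response, where each item carries a `Text` and a `Type`. The function name and the shape of the result are mine, for illustration only.

```javascript
// Count how often each entity text and each entity type appears.
// Returns two lists of [name, count] pairs, most frequent first.
function countEntities(entities) {
  const byText = new Map();
  const byType = new Map();
  for (const { Text, Type } of entities) {
    byText.set(Text, (byText.get(Text) || 0) + 1);
    byType.set(Type, (byType.get(Type) || 0) + 1);
  }
  const sortDesc = (counts) =>
    [...counts.entries()].sort((a, b) => b[1] - a[1]);
  return { entityCounts: sortDesc(byText), typeCounts: sortDesc(byType) };
}
```

Sorting by count makes it easy to pick the top entities when building the narration later.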
Now that we have the entities, their types and counts we want to describe what we have found. We need to create a text that will be read by Amazon Polly so that we can add this narration to our video.
There must be better ways to build this text, but I took only a few hours to complete the whole project, so this is what I did. There are 3 methods that help describe what’s found in the post. They are called describePostTexts, describeEntityCounts, and describeTypesCounts, and they all work the same way: they use a template to create a phrase from the given data.
An output example would be:
The post is titled “Faceless Influencer” and it reads as it follows: [… few paragraphs …]
Ok, now let’s analyze it.
We find a total of 2 entities mentioned. Mainly “youtuber” which appears 1 times, followed by “YouTube” mentioned 1 times. Some other mentions include plus many others.
Regarding the types of entities, it includes 2 organizations, and also some other 2 things.
That’s it for today folks. [… closing thoughts …]
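A template method along these lines could look like the sketch below. It is a hypothetical re-creation of describeEntityCounts, not the code that produced the output above: it assumes the input is a list of `[text, count]` pairs sorted by count, and unlike the bot it says "1 time" instead of "1 times".

```javascript
// Hypothetical sketch: turn sorted [text, count] pairs into a phrase.
function describeEntityCounts(entityCounts) {
  const total = entityCounts.length;
  if (total === 0) return "We find no entities mentioned.";

  const times = (n) => `${n} ${n === 1 ? "time" : "times"}`;
  const noun = total === 1 ? "entity" : "entities";
  const [first, second] = entityCounts;

  let phrase =
    `We find a total of ${total} ${noun} mentioned. ` +
    `Mainly "${first[0]}" which appears ${times(first[1])}`;
  if (second) {
    phrase += `, followed by "${second[0]}" mentioned ${times(second[1])}`;
  }
  return phrase + ".";
}
```

The other two methods can follow the same pattern with their own templates.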
That’s good for now. So let’s move on to getting the speech from this text.
Text to Speech with Amazon Polly
Again, this is just a method invocation to get the audio. Then what we do is to save it locally to an audios folder.
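Here is a rough sketch of that step. It assumes `polly` behaves like a Polly client from version 2 of the AWS SDK for JavaScript, where `synthesizeSpeech(params).promise()` resolves to an object whose `AudioStream` is a Buffer. The voice `Joanna` and the helper names are my choices for illustration.

```javascript
const fs = require("fs");
const path = require("path");

// Request parameters for Polly's SynthesizeSpeech operation.
// Assumption: "Joanna" is just an example voice, pick any you like.
function buildSpeechParams(text, voiceId = "Joanna") {
  return { OutputFormat: "mp3", Text: text, VoiceId: voiceId };
}

// Ask Polly for the audio, save it under `folder`, return the file path.
async function saveSpeech(polly, text, fileName, folder = "audios") {
  const data = await polly.synthesizeSpeech(buildSpeechParams(text)).promise();
  fs.mkdirSync(folder, { recursive: true }); // make sure the folder exists
  const filePath = path.join(folder, fileName);
  fs.writeFileSync(filePath, data.AudioStream);
  return filePath;
}
```

Passing the client in as a parameter keeps the function easy to try out with a fake client before spending real Polly characters.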
One thing to notice here is that we return the fileName or path for the saved file. This is required because we need it later when creating the video. Ok, so we have the audio that will be played on our video, but we are still missing the visual part, the image.
Taking a screenshot for our video
There is a very easy-to-use package called webshot. Give it a URL plus a file path and, after some time, you can open a static capture of the blog post. It really is just that: webshot(url, file, callback). Once more, we keep the path the image was saved to so that we can use it in our following command.
Creating a video from a single image and single audio using FFmpeg
Now that our youtubot has all the ingredients let’s put them in the mixer. FFmpeg is
A complete, cross-platform solution to record, convert and stream audio and video.
That, and it is very simple to use too. All I had to do to create the video was run a command that takes the image and the audio as inputs. The resulting video lasts as long as the audio and shows the same image the whole time.
You could get fancy in here and add several images from the post or even combine videos if you had a video of the post instead of a screenshot: something like a video of scrolling down through the post. Additionally you could combine several audio inputs adding background music or some effects, etc, etc. After all, our bot is a baby that still needs to learn a lot.
“And what is that magical command?” You may ask.
ffmpeg -loop 1 -i screenshot.png -i speech.mp3 -c:v libx264 -tune stillimage -c:a aac -b:a 192k -pix_fmt yuv420p -shortest out.mp4
To be honest, I don’t really understand all the parameters or what they do. The important thing is to specify the image (screenshot.png), the audio (speech.mp3) and the output (out.mp4). The full code for taking the screenshot and creating the video is in the repo linked at the end of this post.
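For the curious, running that command from Node could be sketched as follows. The function names are hypothetical, and it assumes the `ffmpeg` binary is on your PATH. The comments are my best reading of what each flag does.

```javascript
const { execFile } = require("child_process");

// Build the same argument list as the command above.
function buildFfmpegArgs(imagePath, audioPath, outputPath) {
  return [
    "-loop", "1",          // keep repeating the single image
    "-i", imagePath,       // first input: the screenshot
    "-i", audioPath,       // second input: the narration
    "-c:v", "libx264",     // encode the video as H.264
    "-tune", "stillimage", // encoder settings suited to a static picture
    "-c:a", "aac",         // encode the audio as AAC
    "-b:a", "192k",        // audio bitrate
    "-pix_fmt", "yuv420p", // pixel format most players can handle
    "-shortest",           // stop when the shortest input (the audio) ends
    outputPath,
  ];
}

// Run ffmpeg and resolve with the path of the created video.
function createVideo(imagePath, audioPath, outputPath) {
  return new Promise((resolve, reject) => {
    const args = buildFfmpegArgs(imagePath, audioPath, outputPath);
    execFile("ffmpeg", args, (error) =>
      error ? reject(error) : resolve(outputPath)
    );
  });
}
```

Using `execFile` with an argument list avoids shell quoting trouble when a file path contains spaces.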
Putting it all together
We’re almost done with our bot. Let’s wrap it up into a single call so that it’s easier to follow what we do. This is the first code that was shown in this post; I’m adding it here again because now you know how we got to it.
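In case it helps to see the flow in one place, here is a small sketch of such a wrapper. It is not the bot's actual code: every function on the `steps` object is a hypothetical stand-in for one of the stages described in this post, passed in so the flow can be read (and tried with fakes) on its own.

```javascript
// Hypothetical orchestration: each `steps.*` function stands in for
// one stage of the post (fetch, analyze, describe, speak, shoot, mix).
async function createVideoFromPost(steps, username, postId) {
  const { url, text } = await steps.getPost(username, postId);
  const entities = await steps.detectEntities(text);
  const narration = steps.describe(text, entities);

  // The audio and the screenshot don't depend on each other,
  // so they can be produced at the same time.
  const [audioPath, imagePath] = await Promise.all([
    steps.synthesize(narration, `${postId}.mp3`),
    steps.screenshot(url, `${postId}.png`),
  ]);

  return steps.createVideo(imagePath, audioPath, `${postId}.mp4`);
}
```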
There you go. I told you it was simple, even a bot can do it. Let’s put it to the test. The following video was created by running this code.

All the code is available in this github repo, feel free to use it. And as the Faceless Influencer says:
That’s it for today folks. Let me know what you think in the comments. Support me on patreon and don’t forget to like this video and subscribe to my channel. Cheers and until next time.