Best way to manage videos at scale

Shubham Mittal
Curofy Engineering

--

In today’s world, videos have become an easy way to explain any idea as well as to understand any topic. Videos demand more consumer attention than any other medium. People want to access videos anywhere, anytime, and it is a challenge for emerging companies to not only deliver quality content but also give their customers a good viewing experience.

About our company -

We are a platform of 3 lakh verified doctors, the largest in the country. As a fast-moving company we experiment a lot with the product, while at the same time being careful to deliver a quality product for all. Recently, we enabled our users to upload videos of surgical procedures and the like, to share their findings and seek suggestions from their fellow doctors.

Most of our users are on the go, with limited access to fast internet, and this is a challenge when delivering heavy content like video.

We found it quite natural to use the HLS protocol for video delivery, as it is widely supported, developed by Apple and used by many big companies such as Facebook. HLS stands for HTTP Live Streaming. It is a media streaming protocol for delivering video and audio content. An mp4 video is divided into small segments, usually of 10 seconds, and for every segment multiple quality variants are also created, which can be downloaded for playback based on the available internet bandwidth.

So whenever a player plays an HLS video, it can request each segment at a quality suited to the available bandwidth and stream the video without interruption, giving users a smooth experience. At one moment users can be watching a low-resolution video, and the next moment it can switch to a higher definition as soon as the device reaches a higher-bandwidth network.
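To illustrate how the player picks a variant, here is a sketch of a minimal HLS master playlist with three renditions. The resolutions, bandwidth values and paths are placeholders for illustration, not our production values:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=842x480
480p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/index.m3u8
```

The player measures its download throughput and requests segments from the variant whose BANDWIDTH best matches it, switching up or down as conditions change.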

Below are the resolutions we used for our use case, apart from the original resolution of the video -

Resolutions of video to be generated

The challenge was to transform the uploaded videos into the required format as well as to generate a thumbnail to act as the video’s placeholder. There are many service providers that offer video transformation and content delivery, but the main problem is that they are affordable only up to a point; with the growth of the company, you need a solution that is not a burden on your pocket. AWS MediaConvert and S3 are the tools we chose to transform and deliver our video media. With the solution discussed below, we were able to reduce our costs by 90 percent, which is quite significant for a company of any scale. Follow the article if you want to know how we zeroed in on the solution.

Initial Approach

Instead of building an in-house solution we decided to use a third-party service, and Cloudinary fit our use case. It provides on-the-fly video transformation, which really helped us get the transformations we needed, and the total time taken was also very low.

Development was faster as we didn’t have to worry about the transformation. One simply uploads the video, selects the transformations needed, and Cloudinary does the rest. Initially, Cloudinary was well within our budget, but as the feature started to gain traction, more users began uploading videos. This growth spiked the amount we were spending on Cloudinary, and it was time to search for other options.

Our experiments

First, we tested the famous open-source library — ffmpeg — which is capable of doing the video transformations we needed. We had two options for using ffmpeg — use Cloud Functions, or use one of the already running VM instances on GCP (Google Cloud Platform).

Since the whole process works on an on-demand model, the latter choice was discarded. Cloud Functions follows a compute-on-demand model, which means we get resources as and when needed, without affecting our other services. As it is on demand, we don’t have to pay for a server sitting idle.

Cloud Functions are really easy to use and offer the option of writing code in two of the most popular languages — Node.js and Python. We decided to go with Python, as our stack is on Python and we are comfortable with the paradigm. Installing dependencies is really easy: you just mention them in a requirements.txt file and you are set.
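For example, a Cloud Function that talks to Cloud Storage and drives ffmpeg from Python might declare its dependencies like this (the package names are the standard PyPI ones; the pinned versions are illustrative):

```
# requirements.txt (versions illustrative)
google-cloud-storage==1.16.0
ffmpeg-python==0.2.0
```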

We got the script working for the Cloud Function: we first download the video from Cloud Storage, transform it into the required format, and transfer it back to Cloud Storage for streaming.
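A minimal sketch of that script, with the ffmpeg invocation factored into a helper. The flags, resolution and bitrate here are illustrative, and the Cloud Storage download/upload is only noted in comments since the exact client setup depends on your project:

```python
import subprocess

def build_ffmpeg_cmd(src, dst, height, video_bitrate):
    """Build an ffmpeg command that scales a video to the given height
    (scale=-2 keeps the width proportional and even) at the given bitrate,
    copying the audio stream unchanged."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=-2:{height}",
        "-b:v", video_bitrate,
        "-c:a", "copy",
        dst,
    ]

def transform(local_in, local_out, height, video_bitrate):
    # In the real function, the video is first downloaded from Cloud Storage
    # (e.g. blob.download_to_filename(local_in) with google-cloud-storage)
    # and the result uploaded back with blob.upload_from_filename(local_out).
    subprocess.run(
        build_ffmpeg_cmd(local_in, local_out, height, video_bitrate),
        check=True,
    )
```

Running `transform` once per target resolution produces the rendition set described above.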

But there was a catch. We let our users upload videos up to 100 MB, and for a video of that size the Cloud Function took a long time and eventually terminated before completing the whole process, because of the time limit (540 seconds). In our defence, we hadn’t thought of that scenario earlier.

After exploring other options and a lot of research, AWS MediaConvert looked promising. Working with MediaConvert was tricky, as it gives a lot of options to customise your transformation and one can easily get lost (spoilt for choice). But after looking around for some time, we got the template required for our desired transformation.

Our whole flow looked something like -

1. We upload a video to an S3 bucket, which in turn triggers a Lambda function.
2. In the Lambda we figure out the parameters required as input for MediaConvert, using none other than ffmpeg.
3. After calculating all the parameters, we submit a job to AWS MediaConvert to transform our video.
4. To get notified on completion, we create a CloudWatch event for the state change of the submitted MediaConvert job from ‘PROGRESSING’ to ‘COMPLETE’, which in turn triggers another Lambda function.
5. The triggered Lambda function can either make an API call or do a DB update, based on the feasibility of your project.
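The job submission step can be sketched as below. This is a deliberately simplified shape of a MediaConvert job request — real jobs carry many more codec and container settings — and the role ARN, URIs, heights and region are placeholders. boto3 is imported lazily so the settings builder stays testable without AWS credentials:

```python
def build_job_settings(input_uri, output_uri, role_arn, heights=(360, 480, 720)):
    """Assemble a simplified MediaConvert job request: one HLS output group
    with one H.264 rendition per target height."""
    outputs = [
        {
            "NameModifier": f"_{h}p",
            "VideoDescription": {
                "Height": h,
                "CodecSettings": {"Codec": "H_264"},
            },
        }
        for h in heights
    ]
    return {
        "Role": role_arn,
        "Settings": {
            "Inputs": [{"FileInput": input_uri}],
            "OutputGroups": [
                {
                    "OutputGroupSettings": {
                        "Type": "HLS_GROUP_SETTINGS",
                        "HlsGroupSettings": {
                            "Destination": output_uri,
                            "SegmentLength": 10,
                        },
                    },
                    "Outputs": outputs,
                }
            ],
        },
    }

def submit_job(job, region="ap-south-1", endpoint=None):
    # boto3 ships with the Lambda Python runtime. Note that MediaConvert
    # needs an account-specific endpoint (see describe_endpoints).
    import boto3
    client = boto3.client("mediaconvert", region_name=region,
                          endpoint_url=endpoint)
    return client.create_job(**job)
```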

In short:

The AWS solution worked like a charm and was really fast. It may look like a lot of work, but it’s worth doing if you need to scale.

Benefits of using this solution -

  • We use S3 to store and deliver our media content, which is very scalable and affordable as well.
  • AWS MediaConvert is really fast and lets you maintain several queues for submitting transformation jobs, so you don’t have to worry about a sudden jump in the number of videos to handle. The pricing is also really economical: there are no monthly charges, and you pay only for the services you use.
  • AWS Lambda is based on compute on demand, which we have already discussed above. Apart from that, AWS gives you some free Lambda invocations and CPU time per month, which is really great, and even beyond the free tier the pricing doesn’t put a burden on your pocket.

Problems faced during the AWS deployment:

  • Always remember that AWS provides services per region, except for a few like S3 which are available globally. Use services in the same region, otherwise you may not be able to catch events or trigger Lambda functions.
  • Give the necessary permissions to your IAM role for the services you are using.
  • As mentioned above, a video upload to the S3 bucket triggers a Lambda function, where we fill in the necessary transformation details like the resolution and bitrate of the resultant videos. For these values you need the metadata of the uploaded video — height, width and bitrate — and you need a video processing library like ffmpeg to get them. ffmpeg is not preloaded on the container our Lambda runs on, as Lambda runs on one or more containers that are created and deleted on demand.
  • Unlike Google Cloud Functions, where you can just add the library name to requirements.txt and they handle the rest, for AWS Lambda you need to create a zip file of all the libraries and binaries used in the function. After some research we found that you can add the ffmpeg binaries to the /tmp folder and use them in your function. For steps to include ffmpeg in your path visit here, and to create an AWS Lambda deployment package visit here.
  • While testing, we found that videos shot on iOS devices were getting rotated from portrait to landscape after transformation (StackOverflow discussion and AWS forum). The reason is that a video shot on an iOS device is originally saved in landscape mode, with a flag in the video metadata which tells the player to rotate the video while playing. During the transformation the rotation identifier is lost, and because of that the video is played in landscape mode and looks rotated.
  • To overcome this, we first identified those videos using ffmpeg: they have a ‘rotation’ value in the ‘Display Matrix’ of the video metadata, which can be obtained with ffmpeg. We then rotate those videos according to the rotation given, and the rotated video can easily be transformed using the technique above.
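The rotation check above can be sketched as follows, driving ffprobe (which ships with ffmpeg) and parsing its JSON output. The parsing helper is pure, so it works on any ffprobe output; note that depending on the ffprobe version, rotation may appear either in the Display Matrix side data or in an older `rotate` tag, so both are checked here:

```python
import json
import subprocess

def get_rotation(probe_data):
    """Return the rotation (in degrees) recorded for the first video stream
    of ffprobe's JSON output, checking the Display Matrix side data first
    and the older 'rotate' tag second. Returns 0 if no rotation is flagged."""
    for stream in probe_data.get("streams", []):
        if stream.get("codec_type") != "video":
            continue
        for side_data in stream.get("side_data_list", []):
            if side_data.get("side_data_type") == "Display Matrix":
                return int(side_data.get("rotation", 0))
        return int(stream.get("tags", {}).get("rotate", 0))
    return 0

def probe(path):
    # Requires the ffprobe binary on PATH (or in /tmp on Lambda, as above).
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_streams", path],
        capture_output=True, check=True, text=True,
    ).stdout
    return json.loads(out)
```

A non-zero result from `get_rotation(probe(path))` tells us the video needs to be physically rotated with ffmpeg before it is handed to MediaConvert.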

If you found this article helpful, then please hit the clap button as many times as you like. Please feel free to contact us if you need any help with any of the steps mentioned above.
