Design Decisions — Deadshot

🎥 Movie Streaming

Omar Elgabry
OmarElgabry's Blog
Jul 20, 2018

--

Introduction

Deadshot is a movie streaming application, much like Netflix or YouTube.

Application features

  • Watch movies
  • Browse the movie catalog
  • Get movie recommendations tailored to each user

Facts & Numbers

  • Streams tens of millions of hours of video every day.
  • Serves on the order of 100 million users.
  • Operates in more than 100 countries.
  • Each movie is a few gigabytes of data.
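
To get a feel for what these numbers imply, here is a rough back-of-envelope estimate of the daily traffic (the inputs are assumptions picked from the approximate figures above, so treat it as an order-of-magnitude sketch):

```python
# Rough order-of-magnitude estimate of daily video egress.
# All inputs are assumptions based on the approximate figures above.

hours_streamed_per_day = 10_000_000    # "tens of millions of hours" (low end)
avg_movie_length_hours = 2
avg_movie_size_gb = 3                  # "a few gigabytes" per movie

gb_per_hour = avg_movie_size_gb / avg_movie_length_hours   # ~1.5 GB per hour
daily_egress_gb = hours_streamed_per_day * gb_per_hour      # ~15,000,000 GB
daily_egress_pb = daily_egress_gb / 1_000_000                # ~15 PB per day

print(f"~{daily_egress_pb:.0f} PB of video delivered per day")
```

Even at the low end, that is petabytes of video per day, which is the main reason the CDN described below does most of the delivery work.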

Architecture Overview


1. CDN

It caches the movies, as well as the images, CSS, and JavaScript files.

The idea behind a CDN is to put the movies as close as possible to the users by copying them from a central location (the cloud storage) onto computers located all over the world.

By moving the movies as close as possible to the users, their delivery becomes as fast and reliable as possible.

When a user wants to watch a movie, we find the nearest CDN node that has the movie and stream it from there.

Akamai is one of the most widely used CDN services.

2. Load balancer

The load balancer distributes requests evenly among the (auto-scaled) web application servers.
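
As a toy illustration of that even distribution (in practice this is handled by something like an AWS Elastic Load Balancer or nginx, not hand-rolled code), a round-robin balancer simply cycles through the pool of servers:

```python
from itertools import cycle

# Hypothetical pool of auto-scaled web application servers.
servers = ["app-1.internal", "app-2.internal", "app-3.internal"]
next_server = cycle(servers)

def route_request(request):
    """Send each incoming request to the next server in the rotation."""
    target = next(next_server)
    print(f"routing {request} -> {target}")
    return target

for r in ["GET /catalog", "GET /movie/42", "POST /login"]:
    route_request(r)
```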

3. Web app servers

These are the backbone of our architecture. They execute the core business logic, handle requests, and send back responses. One option is AWS EC2.

They communicate with other components in the infrastructure, such as databases, caches, job queues, and cloud storage, to get the work done.

4. Job Queues & Servers

Job queues store a list of jobs that need to be processed asynchronously.

We’ll do some intensive work asynchronously in the background: encoding and validating the huge movie files in parallel.

The job servers are dedicated, high-capacity servers that can run on demand and carry out the work by pulling jobs from the queue.

Some common job queue technologies are RabbitMQ, ActiveMQ, Kafka, Gearman, and Redis (which can also be used as a message broker).
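
As a minimal sketch of the producer/worker pattern, assuming Redis is used as the queue (the redis Python package, the encode_jobs queue name, and the job fields are all illustrative):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Producer (web server): enqueue an encoding job after an upload finishes.
def enqueue_encode_job(movie_id, source_path):
    job = {"movie_id": movie_id, "source": source_path}
    r.rpush("encode_jobs", json.dumps(job))

# Consumer (job server): block until a job is available, then process it.
def worker_loop():
    while True:
        _, raw = r.blpop("encode_jobs")   # blocks until a job arrives
        job = json.loads(raw)
        validate_and_encode(job)

def validate_and_encode(job):
    # Placeholder for the real validation and encoding work.
    print(f"processing movie {job['movie_id']} from {job['source']}")
```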

5. Caching

We’ll cache the most common user queries, as well as pre-generated HTML pages with expensive render blocks.

The two most widespread caching technologies are Redis and Memcached.
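
A common way to use the cache is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache with an expiry. A minimal sketch with Redis (the key format, TTL, and query_database helper are assumptions):

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)
TTL_SECONDS = 300  # cache entries expire after 5 minutes

def get_popular_movies(genre):
    key = f"popular:{genre}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)              # cache hit

    result = query_database(genre)             # cache miss: hit the database
    cache.setex(key, TTL_SECONDS, json.dumps(result))
    return result

def query_database(genre):
    # Placeholder for the real (expensive) catalog query.
    return [{"id": 1, "title": "Example", "genre": genre}]
```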

6. Data processing and analytics

Real-time user interactions are passed through a firehose that processes the data: it puts the data into a standard format and, once transformed, stores it in a data warehouse where it can be analyzed to answer specific questions.

This is extremely useful for gaining insights into user behavior and for recommending movies to watch.

AWS Kinesis and Kafka are two of the most common technologies for the firehose, while AWS Redshift and Hadoop MapReduce are common solutions for the data warehouse.
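
For example, every play, pause, or finish event could be pushed into the firehose as a small JSON record. A sketch using Kinesis via boto3 (the stream name, region, and event fields are assumptions):

```python
import json
import time
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

def track_event(user_id, movie_id, action):
    """Push a single user-interaction event into the stream."""
    event = {
        "user_id": user_id,
        "movie_id": movie_id,
        "action": action,                 # e.g. "play", "pause", "finish"
        "timestamp": int(time.time()),
    }
    kinesis.put_record(
        StreamName="user-interactions",   # hypothetical stream name
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(user_id),        # keeps a user's events ordered
    )

track_event(user_id=42, movie_id=7, action="play")
```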

7. Database

That’s where we store data like users and movie metadata. We can use a relational database, like MySQL.

MySQL is really mature. Very solid. Free. Great community.

8. Cloud storage

A scalable storage service for the files (movies, images, CSS, JavaScript) that we can interact with via a RESTful API over HTTP. Amazon S3 is the most popular cloud storage.
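
Uploading one of the processed files is a single API call with boto3; a minimal sketch (the bucket name, key layout, and file path are illustrative):

```python
import boto3

s3 = boto3.client("s3")

# Upload one of the encoded renditions; the key encodes the movie id and format.
s3.upload_file(
    Filename="/tmp/movie-42-1080p.mp4",   # local file produced by encoding
    Bucket="deadshot-movies",             # hypothetical bucket name
    Key="movies/42/1080p.mp4",
)

# Every stored object is then addressable over HTTP via the S3 REST API.
```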

Part I — From uploading till persisting

The movies source

So, the first question is: where will the movies originally come from? They will be uploaded manually by us.

We can get these movies from, say, online sources, production houses and studios, or anywhere else. Users can also be a source; a user can upload movies.

For now, let’s assume, we’ll upload the movies manually.

Encoding the movies

Before a movie can be watched, it must be:

  1. Validated: checked for color changes, missing frames, or data transmission problems.
  2. Encoded: converted into different formats for different devices.

A job will be inserted in the queue, and the job servers will carry out this process in parallel, asynchronously.

It’s clear that many high-capacity servers are needed to process these huge movie files in parallel.

Once the movie is validated and encoded, it’s stored in the cloud storage. The encoding process creates a lot of files to support different devices.
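
As a rough sketch, a job server might drive a tool like ffmpeg to produce the different renditions (this assumes ffmpeg is installed; the resolutions and bitrates are made-up examples, not a real encoding ladder):

```python
import subprocess

# Target renditions to support different devices and network speeds (illustrative).
RENDITIONS = [
    ("1080p", "1920x1080", "5000k"),
    ("720p",  "1280x720",  "2800k"),
    ("480p",  "854x480",   "1400k"),
]

def encode_movie(source_path, movie_id):
    outputs = []
    for name, size, bitrate in RENDITIONS:
        out = f"/tmp/movie-{movie_id}-{name}.mp4"
        subprocess.run(
            ["ffmpeg", "-y", "-i", source_path,
             "-s", size, "-b:v", bitrate, out],
            check=True,                    # fail the job if encoding fails
        )
        outputs.append(out)
    return outputs                         # then uploaded to cloud storage
```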

Part II — Delivering the movies

Ok. So, now we have the movies encoded and stored on persistent storage. How will they be delivered to the users?

Serving the movie

Using CDNs. That’s the name of the game.

Having many CDN nodes distributed in different locations all over the world brings the movies closer to the users.

But on what basis will we distribute the movies? Will all of them be copied to all the CDNs? Only some of them? Based on what, you ask? Certainly, we want to avoid unnecessary storage.

First Attempt …

Well, one option is to push the most popular content to the CDNs and replicate it in multiple places.

How will the most popular content in the CDN be kept up to date?

Movies might be cached for a specific number of days or hours.

But if everything gets the same expiry, all movies will expire at the same time! Instead, we can expire each movie after a random period, say somewhere between 30 and 50 hours.

Web servers periodically check for expired movies and push fresh copies to the CDN. They also do so on a cache miss, when a user requests a movie that isn’t in the CDN.
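
A sketch of that randomized expiry: each movie pushed to the CDN gets a TTL drawn from the 30 to 50 hour window, so the catalog never expires all at once (the in-memory expiry map and refresh loop are illustrative):

```python
import random
import time

def random_ttl_seconds(low_hours=30, high_hours=50):
    """Pick an expiry somewhere between 30 and 50 hours."""
    return random.randint(low_hours * 3600, high_hours * 3600)

# movie_id -> unix timestamp at which the CDN copy expires (illustrative store)
cdn_expiry = {}

def push_to_cdn(movie_id):
    cdn_expiry[movie_id] = time.time() + random_ttl_seconds()

def refresh_expired():
    """Run periodically on the web servers: re-push anything that expired."""
    now = time.time()
    for movie_id, expires_at in list(cdn_expiry.items()):
        if expires_at <= now:
            push_to_cdn(movie_id)
```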

What about less popular content?

A movie may have only a few plays, and since there are lots of such rarely played movies, the CDNs won’t help here; these movies will be served from the disk blocks in our cloud storage.

Second Attempt …

Another approach is to analyze the data to get insights about our users. This is where the “Data processing and analytics” component we discussed earlier comes into play. With it, we can know:

  1. What movies users have watched, and when, and where.
  2. What movies were viewed but not watched.
  3. How many times each movie has been watched.

With this kind of knowledge, we can predict which movies users are likely to watch, and push those to the CDNs. CDNs in different locations may hold different movies according to users’ preferences in those areas.
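
A toy version of this prediction: count how often users in each region watched each movie, and push the top titles to that region’s CDN (the event data and numbers are invented for illustration):

```python
from collections import Counter

# (region, movie_id) pairs derived from the analytics pipeline (illustrative data).
watch_events = [
    ("eu-west", 1), ("eu-west", 1), ("eu-west", 7),
    ("us-east", 7), ("us-east", 3), ("us-east", 3), ("us-east", 3),
]

def top_movies_per_region(events, top_n=2):
    counts = {}
    for region, movie_id in events:
        counts.setdefault(region, Counter())[movie_id] += 1
    return {region: [m for m, _ in c.most_common(top_n)]
            for region, c in counts.items()}

# These per-region lists are what gets pushed ahead of time to each CDN.
print(top_movies_per_region(watch_events))
# {'eu-west': [1, 7], 'us-east': [3, 7]}
```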

But, what if the requested movie doesn’t exist in the nearest CDN?

If that’s the case, then one of the nearby CDNs is guaranteed to have it. When the user requests a movie, the web server sends back a list of CDNs to try.

What happens when a CDN fails?

The client (web browser or mobile) immediately switches to another CDN and resumes streaming.

What happens if there’s a high load on a CDN?

The same: the client will switch to a more lightly loaded CDN.

To summarize the process:

  1. Periodically, the web servers send each CDN a list of the movies it’s supposed to have, based on the user predictions we talked about earlier.
  2. The web servers also check the health status and available assets of each CDN.

Then, when …

  1. The user clicks on a movie to watch.
  2. The client, say a web browser, sends a request to the web servers indicating which movie the user wants to watch.
  3. The web server identifies and returns a list of the best CDNs to use.
  4. The client selects one of the CDNs to use. It does this by testing the quality of the network connection to each CDN, and connects to the fastest, most reliable one (sketched below). It also keeps running these tests throughout the movie streaming process.
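
A rough sketch of step 4, the client probing each candidate CDN and picking the fastest responder (the URLs and the single-request probe are assumptions; a real player would measure sustained throughput and keep re-testing during playback):

```python
import time
import urllib.request

def measure_latency(url, timeout=2.0):
    """Time a small request to the CDN; return None if it fails."""
    start = time.monotonic()
    try:
        urllib.request.urlopen(url, timeout=timeout).read(1024)
    except OSError:
        return None                 # CDN down or unreachable: skip it
    return time.monotonic() - start

def pick_best_cdn(candidate_urls):
    timings = {u: measure_latency(u) for u in candidate_urls}
    reachable = {u: t for u, t in timings.items() if t is not None}
    return min(reachable, key=reachable.get) if reachable else None

best = pick_best_cdn([
    "https://cdn-1.example.com/health",   # hypothetical health-check endpoints
    "https://cdn-2.example.com/health",
])
```

This also covers the failure and overload cases above: an unreachable or slow CDN simply loses the race, and the client moves on to the next one.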

Part III — Data & Cache

Database

The data we want to store in the database doesn’t change very frequently; we just store user data and movie metadata. It’s not a Twitter-like application.

Still, with a large amount of data that can’t fit on a single machine, we have to split it up: sharding.

We’ll assign users to different shards. All of a user’s data lives on the same shard. No joins, and no shards talking to each other.
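
Routing a user to a shard can be as simple as hashing the user id, so that every query for that user goes to the same shard (the host names are placeholders):

```python
import hashlib

NUM_SHARDS = 4
SHARD_HOSTS = [f"mysql-shard-{i}.internal" for i in range(NUM_SHARDS)]

def shard_for_user(user_id):
    """Deterministically map a user to one shard."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARD_HOSTS[int(digest, 16) % NUM_SHARDS]

# All of this user's data lives on (and is queried from) one shard.
print(shard_for_user(12345))
```

A real deployment would more likely use consistent hashing or a lookup service, so that adding shards later doesn’t remap every existing user.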

Each shard is a master-slave setup: a single master with multiple read slaves. The replication gives us high availability.

The master is multi-threaded and runs on a large machine so it can handle a lot of work. The slaves are single-threaded, usually run on smaller machines, and replication is asynchronous, so the slaves can lag significantly behind the master.

This lag means a user could write data and then read back a stale copy from a slave. One solution is to write to the cache at the same time we write to the master database; the next request will then get the data from the cache.
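
A sketch of that write path: update the master and the cache together, so the next read is served from the cache instead of a possibly lagging slave (the db_execute helper, key format, and query are illustrative):

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)

def update_profile(user_id, profile):
    # 1. Write to the master database (placeholder for the real query).
    db_execute("UPDATE users SET profile = %s WHERE id = %s",
               (json.dumps(profile), user_id))
    # 2. Write the same data to the cache so the next read doesn't
    #    depend on a slave that may not have replicated yet.
    cache.set(f"user:{user_id}:profile", json.dumps(profile))

def db_execute(query, params):
    pass  # would use the master connection of this user's shard
```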

In addition to optimizing the queries, we can split them up if needed; smaller queries can be processed by a replica very quickly. Between queries, we can pause and only execute the next one if the current slave isn’t lagging beyond a specific margin.

Movie metadata, on the other hand, can be stored in one giant table with fewer than a million records.

Querying the movie catalog, with many movies and different filter options (genre, year, rating, etc.), requires indexes on the movie table columns so the results come back very fast; queries should take a few milliseconds.

Later, we might consider a search engine: index and store the movie metadata there so it can be queried and filtered.

Cache

It goes without saying: Cache.

But, what to cache?

The most common user queries on the movie catalog.

We can also cache the movie preview HTML pages, where you can find the movie description, rating, image, and other related information. These pages change infrequently, if at all.


Part IV — The House (Data Centers)

To maintain high availability, we need to have multiple data centers in different regions.

If any region fails, the other regions will step in and handle all the requests. The logic for handling region failures is implemented by us.

Data is copied to all regions, and users are served from the nearest region (behind a latency-based load balancer).

Thank you for reading! If you enjoyed it, please clap 👏 for it.
