How Netflix is using AWS to rule the Entertainment Industry: A Case Study!

Pushkar kumar
6 min read · Sep 26, 2020


Netflix is the world’s leading internet television network, with more than 100 million members in more than 190 countries enjoying 125 million hours of TV shows and movies each day. Netflix uses AWS for nearly all its computing and storage needs, including databases, analytics, recommendation engines, video transcoding, and more — hundreds of functions that in total use more than 100,000 server instances on AWS.

In the second quarter of 2020, Netflix generated total revenue of over 6.14 billion U.S. dollars, up from just over 4.9 billion in the corresponding quarter of 2019. The company’s annual revenue in 2019 amounted to 20.15 billion U.S. dollars, continuing the impressive year-on-year growth Netflix has enjoyed over the last decade.

Did you know that in 2008 Netflix was the victim of a major database corruption? At the time, the company's main business was its DVD-by-mail service, and the incident disrupted DVD shipping for three days. In response, Netflix management decided to move to the cloud, away from relational systems in its own data centers. The shift was from vertically scaled single points of failure to horizontally scalable, highly reliable distributed systems. The cloud it chose was AWS (Amazon Web Services), which offered the company the ability to scale as much as it needed.

Previously, Netflix had to sit down with its IT team to add capacity whenever demand increased. Scalability was a huge issue with physical data warehousing. After the shift to AWS, scaling became seamless: thanks to the elasticity of the cloud, petabytes of storage could be added within minutes to support video streaming. With AWS, Netflix could scale its data warehousing up or down based on user demand.
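To make the idea of scaling with demand concrete, here is a minimal sketch of how a target instance count could be derived from load. It is not Netflix's or AWS's actual scaling logic: the function name `desired_instances`, the per-instance capacity, and the minimum and maximum bounds are all assumptions chosen for illustration.

```python
import math


def desired_instances(current_load: float, capacity_per_instance: float,
                      min_instances: int = 1, max_instances: int = 1000) -> int:
    """Return how many identical instances are needed to serve current_load.

    The result is clamped to the range [min_instances, max_instances].
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(current_load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    # Hypothetical numbers: each instance is assumed to serve 500 requests/second.
    for load in (300, 1_200, 48_000):
        print(load, "req/s ->", desired_instances(load, 500), "instance(s)")
```

With horizontal scaling, the same calculation works in both directions: when the load drops, the target count drops with it and the extra instances can be released.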

Netflix itself has acknowledged that it would have been extremely difficult to scale this much in its own data centers. It spent those years gradually moving its huge streaming operations to AWS, and in early January 2016 it shut down the last data center used by its streaming service. Netflix now has eight times as many users as it had in 2008, which reflects its phenomenal growth over the years. The company currently streams about 150,000,000 hours of video content per day. It serves around 86,000,000 members in 190 countries across the world.

Have a look at how Netflix delivers video to users: it does so through Open Connect, Netflix's own Content Delivery Network (CDN), which is managed from Amazon. The videos that stream to a user are stored in data centers within the networks of Internet service providers, or at Internet exchange points, facilities where most network operators exchange traffic. Netflix distributes traffic directly to Verizon, AT&T, Comcast, and similar network operators at these exchange points. When a user presses the 'play' button, the video is delivered from these sites.

Everything that happens before a video is delivered, such as signing up for the service and searching for videos, is handled in the AWS cloud. The business logic, personalization, search, and data processing that make up the streaming experience all live in AWS. The technology that supports Netflix employees working on the streaming business is also housed in Amazon.

Do you know why Netflix took seven years to shift to Amazon? It rebuilt its entire software platform to take full advantage of the AWS cloud. 'Chaos Monkey' is one of a series of tools developed by Netflix to limit the damage caused by disruptions: it deliberately terminates running instances at random so that systems are built to survive such failures. On Christmas Eve of 2012, the company suffered a streaming failure, and at the time it was running in a single Amazon region. Since then, it has invested heavily in disaster recovery.
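The idea behind Chaos Monkey can be illustrated with a toy simulation. This is only a sketch of the general technique (terminate a random instance and check that every service still has capacity), not Netflix's tool: the fleet is an in-memory dictionary, and the names `pick_victim`, `terminate`, and the instance IDs are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple


def pick_victim(groups: Dict[str, List[str]],
                rng: random.Random) -> Optional[Tuple[str, str]]:
    """Pick one running instance at random; return (group, instance) or None."""
    non_empty = sorted(name for name, members in groups.items() if members)
    if not non_empty:
        return None
    group = rng.choice(non_empty)
    return group, rng.choice(groups[group])


def terminate(groups: Dict[str, List[str]], group: str, instance: str) -> None:
    """Simulate a termination by removing the instance from its group."""
    groups[group].remove(instance)


if __name__ == "__main__":
    # Hypothetical fleet: two services, each with redundant instances.
    fleet = {
        "api": ["api-1", "api-2", "api-3"],
        "search": ["search-1", "search-2"],
    }
    victim = pick_victim(fleet, random.Random())
    if victim is not None:
        terminate(fleet, *victim)
        print("terminated", victim)
    for name, members in fleet.items():
        print(name, "still has", len(members), "instance(s)")
```

Because every service in the example runs more than one instance, losing any single instance leaves each service with capacity, which is the property such a tool is meant to verify.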

Now, Netflix mainly operates in the Oregon, Northern Virginia, and Dublin regions. What if one of these regions goes down? In that case, Netflix redirects traffic to the other available regions at a moment's notice. The company also keeps backups of all its data, stored within Amazon itself.
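A regional failover of this kind can be pictured as shifting the failed region's share of traffic onto the surviving regions. The sketch below assumes traffic is described by simple per-region weights and that survivors absorb the load in proportion to their existing share; the function name `redistribute` and the weights are illustrative, and the article does not describe how Netflix actually steers traffic.

```python
from typing import Dict


def redistribute(weights: Dict[str, float], failed_region: str) -> Dict[str, float]:
    """Return new traffic weights (summing to 1) with failed_region removed."""
    if failed_region not in weights:
        raise KeyError(f"unknown region: {failed_region}")
    survivors = {r: w for r, w in weights.items() if r != failed_region}
    total = sum(survivors.values())
    if total <= 0:
        raise RuntimeError("no healthy region left to absorb traffic")
    # Scale each surviving region's share so the shares sum to 1 again.
    return {region: weight / total for region, weight in survivors.items()}


if __name__ == "__main__":
    # AWS region codes for Oregon, Northern Virginia, and Dublin (Ireland).
    # The traffic shares themselves are made up for this example.
    traffic = {"us-west-2": 0.3, "us-east-1": 0.5, "eu-west-1": 0.2}
    for region, share in redistribute(traffic, "us-east-1").items():
        print(f"{region}: {share:.0%}")
```

The important consequence is capacity planning: each surviving region must be able to scale up to carry its larger share, which is where the elasticity of the cloud comes in.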

Netflix chose the distributed database Cassandra to store customer data; in production, every data element is replicated many times. The primary backups of all data are written to S3 (Simple Storage Service). These S3 backups make it possible to recover from operator errors, logical errors, software bugs, and other such corruptions. 'Armageddon Monkey' is Netflix's attempt to recover from failures of all its systems on AWS.
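To show what "replicated many times" can mean in a distributed database, here is a simplified sketch of ring-based replica placement: each node and each key is hashed onto a ring, and a key is stored on the next few nodes clockwise from its position. This is a teaching model under stated assumptions, not Cassandra's implementation: it uses MD5 and one token per node for simplicity, and the node names and the function `replicas_for` are hypothetical.

```python
import bisect
import hashlib
from typing import List


def token(value: str) -> int:
    """Map a string to a position on the ring (a 128-bit integer)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


def replicas_for(key: str, nodes: List[str], replication_factor: int = 3) -> List[str]:
    """Return the nodes that hold copies of key, walking clockwise on the ring."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    # First node whose token is greater than the key's token, wrapping around.
    start = bisect.bisect_right(tokens, token(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    for customer in ("customer-1001", "customer-1002"):
        print(customer, "->", replicas_for(customer, cluster))
```

With three copies of each element on different nodes, the loss of any single node still leaves two copies available.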

Netflix Architecture on AWS

Improving Customer Experience with Real-Time Network Monitoring

Netflix’s Amazon Kinesis Data Streams-based solution has proven to be highly scalable, processing billions of traffic flows each day. Typically, about 1,000 Amazon Kinesis shards work in parallel to process the data stream. “Amazon Kinesis Data Streams processes multiple terabytes of log data each day, yet events show up in our analytics in seconds,” says John Bennett, senior software engineer at Netflix. “We can discover and respond to issues in real time, ensuring high availability and a great customer experience.”
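Shards can work in parallel because each record is routed to exactly one shard: Kinesis Data Streams hashes a record's partition key with MD5 into a 128-bit value and assigns the record to the shard whose hash-key range contains it. The sketch below imitates that routing locally, assuming the hash space is split evenly between shards (real streams can have uneven ranges after resharding); `shard_for` and the `flow-N` keys are made up for the example.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # an MD5 digest is 128 bits wide


def shard_for(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose range contains the key's MD5 hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Evenly sized, contiguous ranges: shard i covers the i-th slice of the space.
    return hash_key * shard_count // HASH_SPACE


if __name__ == "__main__":
    counts = Counter(shard_for(f"flow-{i}", 1000) for i in range(100_000))
    print("shards that received records:", len(counts))
    print("records on the busiest shard:", max(counts.values()))
```

Running the demo shows how a large number of distinct keys spreads across the shards, which is what lets many consumers process the stream side by side.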

Netflix is now able to identify new ways to optimize its applications, whether that means moving an application from one region to another or changing to a more appropriate network protocol for a specific type of traffic. “Our solution built on Amazon Kinesis enables us to identify ways to increase efficiency, reduce costs, and improve resiliency for the best customer experience,” says Bennett.

Although a streaming data solution is not new to the IT industry, it is an innovation in the networking space. “Netflix is heavily invested in AWS in part because it abstracts the underlying network, so we don’t have to deal with switches and routers,” says Bennett. “We’re monitoring, analyzing, and optimizing at a higher level of the stack — in ways we would never even consider if we were running our own data centers.”

Application Monitoring on a Massive Scale

Netflix uses Amazon Web Services (AWS) for nearly all its computing and storage needs, including databases, analytics, recommendation engines, video transcoding, and more — hundreds of functions that in total use more than 100,000 server instances on AWS.

This results in an extremely complex and dynamic networking environment where applications are constantly communicating inside AWS and across the Internet. Monitoring and optimizing its network is critical for Netflix to continue improving customer experience, increasing efficiency, and reducing costs. In particular, Netflix needed a solution for ingesting, augmenting, and analyzing the multiple terabytes of data its network generates daily in the form of virtual private cloud (VPC) flow logs. This would enable Netflix to identify performance-improvement opportunities, such as identifying apps that are communicating across regions and collocating them. The company would also be able to increase uptime by quickly detecting and mitigating application downtime.

Each log record carries information about the communications between two IP addresses. However, in a dynamic environment like the one at Netflix, where an IP address can float between applications from day to day or even minute to minute, IP addresses alone don’t have much meaning.
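One way to give such records meaning is to augment each one with the application that owned each IP address. The sketch below parses a record in the default VPC flow log format and enriches it from a lookup table; the table, the application names, and the `enrich` function are hypothetical, and the article does not describe how Netflix's own pipeline performs this step. Since addresses float between applications, a real lookup would also need to be time-aware, whereas this example uses a single snapshot.

```python
from typing import Dict, NamedTuple

# Field order of the default (version 2) VPC flow log record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


class Owner(NamedTuple):
    app: str
    region: str


def parse_flow_log(line: str) -> Dict[str, str]:
    """Split one space-separated flow log record into named fields."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def enrich(record: Dict[str, str], ip_owners: Dict[str, Owner]) -> Dict[str, object]:
    """Attach application names and a cross-region flag to a parsed record."""
    src = ip_owners.get(record["srcaddr"])
    dst = ip_owners.get(record["dstaddr"])
    enriched: Dict[str, object] = dict(record)
    enriched["src_app"] = src.app if src is not None else "unknown"
    enriched["dst_app"] = dst.app if dst is not None else "unknown"
    enriched["cross_region"] = (
        src is not None and dst is not None and src.region != dst.region
    )
    return enriched


if __name__ == "__main__":
    # Hypothetical snapshot of which application held each address.
    owners = {
        "172.31.16.139": Owner("playback-api", "us-east-1"),
        "172.31.16.21": Owner("recommendations", "us-west-2"),
    }
    sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
              "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
    row = enrich(parse_flow_log(sample), owners)
    print(row["src_app"], "->", row["dst_app"], "| cross-region:", row["cross_region"])
```

Records flagged as cross-region are the kind of signal that could point to applications worth placing in the same region.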

Conclusion

Today, Netflix is the 10th largest Internet company in the world. ‘Supporting such rapid growth would have been extremely difficult out of our own data centers; we simply could not have racked the servers fast enough,’ Netflix’s blog post says. It continues, ‘Elasticity of the cloud allows us to add thousands of virtual servers and petabytes of storage within minutes, making such an expansion possible.’ So, that is the power of Amazon Web Services propelling one of the most ambitious companies on earth, Netflix, into uncharted territory and runaway success!
