Announcing Zeno — Netflix’s In-Memory Data Distribution Framework
by Drew Koszewnik
Netflix’s Video Metadata Service (VMS) is the platform that supplies all of the data about our movies and TV shows necessary to drive the Netflix experience. This data ranges from video titles and synopses to the resolutions and bitrates of streams for playback.
In previous blog posts, we revealed how we manage this large volume of data efficiently for our high availability applications by loading it, in its entirety, directly into memory. Our rationale behind this architectural decision stems from the extremely low latency tolerance required to support the billions of requests Netflix serves per day.
Object Cache for Scaling Video Metadata Management
responding to service scaling challenges
There are two overlapping data sets which are managed by VMS in different ways:
- Data describing “things” (e.g. videos, genres, actors, and streams).
- The relationships between these “things”.
Previously, we described our use of NetflixGraph to represent the “relationships” data set above. In contrast, we represent the “things” data set as POJOs. In this article, we introduce Zeno, the framework we use to distribute gigabytes of object data to thousands of servers across the globe and keep it up to date. This post will show how Zeno:
- Resulted in an immediate memory footprint reduction of roughly 50% for the largest in-memory dataset on Netflix’s servers.
- Significantly reduced Netflix server startup times across the board.
- Reduced the development maintenance effort required to evolve our data model.
The challenges we address with Zeno are faced by many organizations. Today, we are open sourcing Zeno. If your team faces some of the challenges described in this article, you may be surprised by the efficiency gains you can realize by applying this library.
What’s in the Box?
Zeno binaries are uploaded to Maven Central, and the code is available for download from GitHub. Zeno:
- Creates compact serialized representations of a set of Java objects.
- Automatically detects and removes duplication in a data set.
- Serializes data changes, so updates require minimal resource usage.
- Is efficient about memory and GC impact when deserializing data.
- Provides powerful tools to debug data sets.
- Defines a pattern for separation between data model and data operations, increasing the agility of development teams.
Zeno will create serialized representations, but some assembly is required: Zeno does not make assumptions about how the serialized data will be transported to consumers. This is by design; VMS is only the first use case for Zeno, and this scope limit opens up potential for a wider range of use cases.
Data Distribution Challenges
During its initial development stages, Zeno’s primary mandate was to improve the performance of VMS’s object data propagation. Our data set is small enough to fit in memory, but loading it on the Java heap and keeping it updated throughout the life of the server presents several problem-specific performance challenges:
- It should not take a long time to initialize the data set in memory. Long server startup times impact our autoscaling ability.
- Periodic updates should not cause significant spikes in server resource usage.
- Periodic updates will create long-lived objects, and since young space collection pauses are proportional to the amount of surviving data, this can lead to longer stop-the-world GC pauses during data updates.
- Transferring large amounts of data takes time, and our goal is to minimize the time between data becoming available in our source of truth, and the effects of that data becoming visible to customers using the service.
In order to understand what Zeno has to offer in this arena, let’s explore the current state of the VMS architecture.
VMS Data Distribution
The VMS architecture consists of a single server (with plenty of CPU and RAM), called the “data origination server”, which every few minutes pulls in all of the data necessary from multiple sources and reconstructs our video metadata objects.
The object data this server constructs encompasses all of the video metadata necessary to power the Netflix experience. Each time this data is constructed, we use Zeno to create a serialized representation, which we upload to a persistent file store (we use S3, but any storage technology would serve). After upload, we notify all of the thousands of Netflix servers that new data is available. Each of the Netflix servers then downloads the data and deserializes it using Zeno.
The actual transmission of the serialized data is out of scope for the Zeno framework. As such, use of Zeno does not require use of a persistent file store. However, we achieve fault-tolerance by using one. If the data origination server goes down for any reason, the serialized representation of our POJO instances is still available. The data across all Netflix servers will simply become “stale” until a new data origination server comes back online.
In the following two sections, we explore in more detail two specific features via which Zeno supports an extremely high level of efficiency for this data propagation strategy.
Deltas and Snapshots
Each time the data origination server constructs POJO instances, they will be slightly different from the previous time they were constructed. These differences are usually very small relative to the entire dataset. We don’t want to ship the entire dataset in order to effect a relatively small change in the data. Instead, we rely on Zeno to calculate a series of modifications to apply to the current data state in order to transform the data to the next state. We then ask Zeno to produce a small file describing these modifications, which we call a “delta” file.
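To make the idea of a delta concrete, here is a minimal sketch, assuming each data state can be modeled as a map from a record key to a record value. The class and method names (DeltaSketch, diff, apply) are hypothetical and chosen for illustration; they are not Zeno's API, and Zeno's binary delta format is not shown here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: hypothetical names, not Zeno's API.
final class DeltaSketch {

    /** A delta: keys to remove, plus entries to add or overwrite. */
    static final class Delta {
        final Set<String> removed = new HashSet<>();
        final Map<String, String> upserted = new HashMap<>();
    }

    /** Computes the modifications that turn state "from" into state "to". */
    static Delta diff(Map<String, String> from, Map<String, String> to) {
        Delta delta = new Delta();
        for (String key : from.keySet()) {
            if (!to.containsKey(key)) {
                delta.removed.add(key);
            }
        }
        for (Map.Entry<String, String> entry : to.entrySet()) {
            // Only carry entries that are new or whose value changed.
            if (!entry.getValue().equals(from.get(entry.getKey()))) {
                delta.upserted.put(entry.getKey(), entry.getValue());
            }
        }
        return delta;
    }

    /** Applies a delta in place, bringing "state" to the next state. */
    static void apply(Map<String, String> state, Delta delta) {
        state.keySet().removeAll(delta.removed);
        state.putAll(delta.upserted);
    }

    public static void main(String[] args) {
        Map<String, String> current = new HashMap<>();
        current.put("video:1", "Forrest Gump");
        current.put("video:2", "The Polar Express");

        Map<String, String> next = new HashMap<>(current);
        next.put("video:3", "Example Title");
        next.remove("video:2");

        Delta delta = diff(current, next);
        apply(current, delta);
        System.out.println(current.equals(next)); // prints "true"
    }
}
```

Unchanged records never appear in the delta, so its size tracks the amount of change rather than the size of the data set.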
Netflix servers which stay online for long periods of time can stay up to date indefinitely by using Zeno to apply these delta updates when they are available. However, due to both deployment and autoscaling, Netflix servers come up and shut down in AWS datacenters across two continents continuously. Newly started servers need a way to efficiently bootstrap their data.
To address this need, every so often the data origination server uses Zeno to create a file encompassing the entire data set. This larger “snapshot” file contains all of the data necessary to bootstrap to the current point in time. From this state, newly created servers can chain delta updates together to get to a later state.
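The bootstrap sequence described above can be sketched as a small planning routine. This is a sketch under assumptions: states are identified by increasing version numbers, each delta leads from one version to a later one, and the step labels ("snapshot-N", "delta-A-B") are a made-up naming scheme. None of this is prescribed by Zeno, which leaves transport and file naming to the user.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustrative sketch only: hypothetical names and naming scheme, not Zeno's API.
final class BootstrapPlanSketch {

    /**
     * Plans how a newly started server reaches the latest state:
     * load the most recent snapshot, then chain every delta that follows it.
     *
     * @param snapshotVersions versions for which a snapshot file exists
     * @param deltas           maps a version to the version its delta leads to
     */
    static List<String> plan(NavigableSet<Long> snapshotVersions, Map<Long, Long> deltas) {
        if (snapshotVersions.isEmpty()) {
            throw new IllegalStateException("no snapshot has been published yet");
        }
        List<String> steps = new ArrayList<>();
        long version = snapshotVersions.last();
        steps.add("snapshot-" + version);

        Long next = deltas.get(version);
        // Follow the chain; the "next > version" guard prevents looping on bad input.
        while (next != null && next > version) {
            steps.add("delta-" + version + "-" + next);
            version = next;
            next = deltas.get(version);
        }
        return steps;
    }

    public static void main(String[] args) {
        NavigableSet<Long> snapshots = new TreeSet<>(Arrays.asList(10L, 20L));
        Map<Long, Long> deltas = new HashMap<>();
        deltas.put(20L, 21L);
        deltas.put(21L, 22L);
        System.out.println(plan(snapshots, deltas));
    }
}
```

A long-running server follows the same chain, starting from whatever version it already holds instead of from a snapshot.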
The delta file approach minimizes the amount of data we need to distribute to many servers. We gain multiple benefits from this minimization:
- The data update thread competes for fewer of the servers’ resources.
- We create vastly fewer long-lived objects during data updates.
- The small “delta” files take very little time to upload, download, and process.
This last benefit is compounded by the distributed nature of the Netflix infrastructure. It takes a lot longer to copy data, for example, across the Atlantic Ocean to Europe than it does to copy data within the same data center. We therefore have our data origination server replicate each file (snapshot or delta) which it uploads. These files are uploaded to S3 buckets local to each AWS region in which Netflix servers operate.
Because the delta files are small, our data origination server can make them available in any AWS region worldwide just seconds after they become visible locally in the US. Consequently, even though our system is “eventually” consistent, our servers worldwide have the ability to keep their entire dataset in lockstep synchronization within seconds of each other.
Both the snapshot and delta files are encoded in a binary format we at Netflix refer to as “The Blob”. This name was used for lack of a better term during development to describe the format, and it stuck.
Now, with the open-source Zeno framework, anyone can take POJOs in any arbitrary data model and create their own blobs. Zeno will automatically calculate the minimum set of changes required to bring data up to date, and it provides convenient APIs to produce and consume deltas and snapshots. This method of data distribution, battle-tested at Netflix scale, is now easy to replicate for any data set.
Deduplication
Many of the objects in our POJO model on the data origination server will represent duplicate data. For example, the object representing the actor Tom Hanks who starred in Forrest Gump will be identical to the object representing Tom Hanks who voiced the conductor in The Polar Express.¹
All duplication in the object data set is automatically detected and removed when serialized using Zeno. Not only does this reduce the size of the serialized snapshot and delta files, but it also significantly reduces the size of the memory footprint of the data on Netflix’s servers.
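As a rough illustration of why deduplication shrinks the memory footprint, the sketch below keeps a single canonical instance for each distinct value, so every reference to an equal object points at the same instance. The names (DedupSketch, Canonicalizer, Actor) are hypothetical, and this in-heap approach is only an analogy for the general technique; it does not show how Zeno performs deduplication in its serialized stream.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not Zeno's implementation.
final class DedupSketch {

    /** A small value type; equality is based on content, not identity. */
    static final class Actor {
        final String name;

        Actor(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Actor && ((Actor) other).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** Returns one shared instance for all values that are equal to each other. */
    static final class Canonicalizer<T> {
        private final Map<T, T> canonical = new HashMap<>();

        T intern(T value) {
            T existing = canonical.putIfAbsent(value, value);
            return existing != null ? existing : value;
        }

        int uniqueCount() {
            return canonical.size();
        }
    }

    public static void main(String[] args) {
        Canonicalizer<Actor> actors = new Canonicalizer<>();
        Actor inForrestGump = actors.intern(new Actor("Tom Hanks"));
        Actor inPolarExpress = actors.intern(new Actor("Tom Hanks"));
        System.out.println(inForrestGump == inPolarExpress); // prints "true"
    }
}
```

This only works safely when the shared objects are treated as immutable, since a change made through one reference would be visible through all of them.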
Before Zeno, we spent a great deal of ad-hoc effort finding and eliminating the largest offenders of duplicate data. However, on our initial test of “blob-enabled” VMS we were very pleasantly surprised to see that complete deduplication resulted in the memory footprint of our data set being cut in half!
In addition, because data is already deduplicated in the serialized stream, and because raw Zeno serialization and deserialization are extremely fast, our initial test of “blob-enabled” VMS showed a reduction in initial deserialization times of between 75% and 90%. Since VMS initialization took the majority of server startup time for many of our applications, Zeno had a very significant positive impact on Netflix’s autoscaling ability.
Some detail about how data is deduplicated with Zeno is explored in the original internal presentation, most of which is now available for public review here.
So far, this article has discussed how Zeno enables efficient distribution of in-memory data. Zeno also enables the development of additional components which can operate on any data model.
Data Model Separation
The VMS object model itself evolves with our requirements. This evolution happens frequently and regularly; as we make improvements to the Netflix experience, we constantly need new data to drive new features.
In order to prevent this constant evolution of our data model from obstructing our rapid pace of innovation, the Zeno framework defines a pattern for the creation of components which can act on any data model without requiring awareness of structure or semantics:
- We define a set of NFTypeSerializers. These “serializers” serve simply as formulaic descriptions of the objects; they do not contain any logic.
- Operations extend a Zeno-defined interface SerializationFramework. These operations use the serializers to traverse objects in a data model.
When we define our “operations”, we are essentially describing how to traverse an instance of any arbitrary POJO class, and the actions to take when we get to nodes of different types. We might think of this as the POJO equivalent of a SAX parser, since we define the actions to take as we encounter specific elements in the object instances.
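The separation between descriptions and operations can be sketched as follows. The interfaces here (TypeDescriptor, FieldVisitor) are simplified stand-ins invented for this example; they are not the actual signatures of NFTypeSerializer or SerializationFramework, for which the Zeno documentation is the reference. The point is only that the descriptor lists fields without containing logic, while each operation supplies its own behavior for the events it receives.

```java
// Illustrative sketch only: simplified, hypothetical interfaces, not Zeno's API.
final class ModelSeparationSketch {

    /** Events an operation receives while an object is traversed (SAX-style). */
    interface FieldVisitor {
        void field(String typeName, String fieldName, Object value);
    }

    /** Formulaic description of one type: it lists fields and holds no logic. */
    interface TypeDescriptor<T> {
        String name();

        void describe(T instance, FieldVisitor visitor);
    }

    static final class Video {
        final String title;
        final int releaseYear;

        Video(String title, int releaseYear) {
            this.title = title;
            this.releaseYear = releaseYear;
        }
    }

    static final class VideoDescriptor implements TypeDescriptor<Video> {
        @Override
        public String name() {
            return "Video";
        }

        @Override
        public void describe(Video video, FieldVisitor visitor) {
            visitor.field(name(), "title", video.title);
            visitor.field(name(), "releaseYear", video.releaseYear);
        }
    }

    /** One operation: renders any described object as text. */
    static <T> String render(TypeDescriptor<T> descriptor, T instance) {
        StringBuilder out = new StringBuilder(descriptor.name()).append('{');
        descriptor.describe(instance, (type, field, value) ->
                out.append(field).append('=').append(value).append(';'));
        return out.append('}').toString();
    }

    /** A second operation over the same descriptor: counts the fields visited. */
    static <T> int countFields(TypeDescriptor<T> descriptor, T instance) {
        int[] count = {0};
        descriptor.describe(instance, (type, field, value) -> count[0]++);
        return count[0];
    }

    public static void main(String[] args) {
        Video video = new Video("Forrest Gump", 1994);
        // prints "Video{title=Forrest Gump;releaseYear=1994;}"
        System.out.println(render(new VideoDescriptor(), video));
    }
}
```

Adding a field to Video means updating only VideoDescriptor; render and countFields pick up the new field without modification.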
Zeno ships with multiple framework implementations, and the patterns can be replicated to produce new operations as well. See the documentation for complete details.
The data ultimately represented in our object model comes from multiple sources:
- Some of it is manually curated
- Some is obtained from external sources
- Some is generated by automated systems
The VMS data origination server also applies business logic to combine and index this data.
Each of these steps is a potential source of data inaccuracy, with consequences on the Netflix site and devices ranging from displaying an incorrect release year for a title to being completely unable to watch a movie or TV show.
We have found that the ability to inspect the differences between two data states goes a very long way towards pinpointing errors in the business logic which builds our object data, and detecting pure data issues. Zeno includes a proven and reliable tool for understanding the exact differences between two arbitrary sets of data in just a few moments. Details about how to effectively use this functionality are described in the documentation.
The Zeno framework is a powerful component in the Netflix architecture. It provides the ability to efficiently propagate and keep up to date large datasets in RAM across many servers, and is proven to work at Netflix scale. It ships with data debugging tools applicable to any data set, and defines a pattern for separation of data model and data operations, which can increase the agility of development teams.
As of today, Zeno is available for application towards similar challenges your team may be facing. Zeno will continue to be applied and updated by the team at Netflix; if you’d like to reach us to talk about it, we’ve created a Google group for that purpose. When we open sourced NetflixGraph earlier this year, we soon heard success stories from teams within Netflix and externally about efficiency improvements through its application towards different problems. We hope to achieve the same success with the Zeno framework.
It’s impossible to describe how satisfying it is to solve enormous scalability challenges, work with the extremely talented Netflix development team, and share responsibility for the design and control of the infrastructure responsible for serving 40 million Netflix subscribers. Does this seem interesting to you? There are many more challenges where this came from. Check out jobs.netflix.com to explore our open positions.
¹ If you clicked the links above and got to the display page for these movies, I’ll give you one guess where the metadata used to draw these pages came from.
Originally published at techblog.netflix.com on December 5, 2013.