Netflix is now a global company, so we wanted to provide a viewing experience that is truly available everywhere, even when the Internet is not working well. This led to three prioritized download use cases:
- Better, uninterrupted videos on unreliable Internet
- Reducing mobile data usage
- Watching Netflix without an Internet connection (e.g. on a train or plane)
… So, What Do We Build?
From a product perspective, we had many initial questions about how the feature should behave: What bitrate & resolution should we download content at? How much configuration should we offer to users? How will video bookmarks work when offline? How do we handle profiles?
We adopted some guiding principles based on general Netflix philosophies about what kind of products we want to create: the Downloads interface should not be so prominent that it’s distracting, and the UX should be as simple as possible.
We chose an aggressive timeline for the feature since we wanted to deliver the experience to our members as soon as possible. We aimed to create a great experience with just the right amount of scope, and we could iterate and run A/B tests to improve the feature later on. Fortunately, our Consumer Insights team also had enough time to qualify our initial user-experience ideas with members and non-members before they were built.
How Do Downloads Work?
From an organizational perspective, the downloads feature was a test of coordination between a wide variety of teams. A technical spec was created that represented a balancing act of meeting license requirements, member desires, and security requirements (protecting from fraud). For Android, we used the technical spec to define which pieces of data we’d need to transfer to the client in order to provide a single ‘downloaded video’:
- Content manifest (URLs for audio and video files)
- Media files:
  - Primary video track
  - 2 audio tracks (one primary language plus an alternate based on user language preferences)
  - 2 subtitle tracks (based on user language preferences)
  - Trick play data (images while scrubbing)
- DRM licenses
- Title-level metadata and artwork (cached to disk)
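The list above can be read as a small data model. The sketch below is illustrative only and is written in Python for brevity rather than Android Java. Every class and field name (`DownloadedVideo`, `MediaFile`, `is_complete_set`, and so on) is hypothetical and not taken from the Netflix codebase, and treating the two-track counts as upper bounds is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaFile:
    kind: str        # "video", "audio", "subtitle", or "trickplay"
    url: str         # location taken from the content manifest
    size_bytes: int  # expected size, useful later for progress reporting


@dataclass
class DownloadedVideo:
    title_id: str
    manifest_urls: List[str]                       # content manifest entries
    media_files: List[MediaFile] = field(default_factory=list)
    drm_license: bytes = b""                       # offline DRM license blob
    metadata: Dict[str, str] = field(default_factory=dict)  # title metadata, artwork paths

    def total_bytes(self) -> int:
        """Sum of all media file sizes that must be transferred."""
        return sum(f.size_bytes for f in self.media_files)

    def is_complete_set(self) -> bool:
        """Check the track counts from the list above: exactly one video
        track, one or two audio tracks, and at most two subtitle tracks."""
        counts: Dict[str, int] = {}
        for f in self.media_files:
            counts[f.kind] = counts.get(f.kind, 0) + 1
        return (counts.get("video", 0) == 1
                and 1 <= counts.get("audio", 0) <= 2
                and counts.get("subtitle", 0) <= 2)
```

Grouping everything under one object per title is what lets later steps (progress reporting, deletion, license renewal) treat a 'downloaded video' as a single unit even though it spans many files.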
We initially looked at Android’s DownloadManager as the mechanism to actually transfer files and data to the client. This component was easy to use and handled some of the functionality we wanted. Ultimately, however, it didn’t allow us to create the UX we needed.
We created the Netflix DownloadManager for the following reasons:
- Download Notifications: display download progress in a notification as an aggregate of all the files related to one ‘downloadable video’.
- Pause/Resume Downloads: provide a way for users to temporarily halt downloading.
- Network Handling: dynamic network selection criteria in case the user changes this preference during a download (WiFi-only vs. any connection).
- Analytics: understanding the details of all user behavior and the reasons why a download was halted.
- Change of URL (CDN switching): Our download manifest provides multiple CDNs for the same media content. In case of failures to one CDN we wanted the ability to failover to alternate sources.
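Two of the requirements above, aggregated progress and CDN failover, are easy to show in isolation. The following is a minimal, platform-neutral sketch in Python, not the Netflix DownloadManager itself: the function names, the dictionary-based bookkeeping, and the use of `OSError` to signal a failed transfer are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Optional


def aggregate_progress(done_bytes: Dict[str, int],
                       total_bytes: Dict[str, int]) -> int:
    """Return one 0-100 percentage across every file of a single title."""
    total = sum(total_bytes.values())
    if total == 0:
        return 0
    # Clamp each file so a miscounted file can never push the total past 100%.
    done = sum(min(done_bytes.get(name, 0), size)
               for name, size in total_bytes.items())
    return done * 100 // total


def fetch_with_failover(cdn_urls: Iterable[str],
                        fetch: Callable[[str], bytes]) -> Optional[bytes]:
    """Try each CDN URL in manifest order and return the first payload.

    `fetch` is a caller-supplied function that raises OSError on failure.
    Returns None when every CDN has failed.
    """
    for url in cdn_urls:
        try:
            return fetch(url)
        except OSError:
            continue  # this CDN failed; fall through to the next source
    return None
```

Weighting progress by bytes rather than by file count keeps the notification from jumping when small subtitle files finish long before the video track.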
To store metadata for downloaded titles, our first implementation was a simple solution of serializing and deserializing JSON blobs to files on disk. We knew there would be problems with this (many objects created, GC churn, not developer-friendly), so while it wasn’t our desired long-term solution, it met our needs to get a prototype off the ground.
For our second iteration of managing stored data, we looked at a few possible solutions, including built-in SQLite support. We’d also heard a lot about Realm lately, including that a few companies had had success using it as a fast and simple data-storage solution. Because we had limited experience with Realm and the downloads metadata case was relatively small and straightforward, we thought it would be a great opportunity to try Realm out.
Realm turned out to be easy to use and has a few benefits we like:
- Zero-copy IO (by using memory mapping)
- Strong performance profile
- It’s transactional and has crash safety (via MVCC)
- Objects are easy to implement
- Easy to query, no SQL statements
Realm also provides straightforward support for versioning of data, which allows data to be migrated from schema to schema if changed as part of an application update. In this case, a RealmMigration can be created which allows for mapping of data.
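In Realm Java, a `RealmMigration` receives the old and new schema versions and is expected to bring the data forward. The sketch below does not use Realm at all. It only illustrates the step-by-step upgrade pattern such a migration typically follows, using plain Python dictionaries. The field names (`name`, `title`, `bookmark_seconds`) and version numbers are hypothetical.

```python
from typing import Any, Dict

CURRENT_SCHEMA_VERSION = 2  # hypothetical version shipped with the app update


def migrate(record: Dict[str, Any], old_version: int) -> Dict[str, Any]:
    """Upgrade one stored record, one version step at a time."""
    record = dict(record)  # work on a copy so the caller's data is untouched
    version = old_version
    if version == 0:
        # v0 -> v1: hypothetical rename of "name" to "title"
        record["title"] = record.pop("name", "")
        version = 1
    if version == 1:
        # v1 -> v2: hypothetical new field, defaulted for existing rows
        record.setdefault("bookmark_seconds", 0)
        version = 2
    if version != CURRENT_SCHEMA_VERSION:
        raise ValueError(f"unsupported schema version {old_version}")
    return record
```

Chaining the steps with consecutive `if` blocks means a user who skipped several app updates still passes through every intermediate schema in order.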
The challenges that most affected our implementation were single-thread access for objects and a lack of support for vectors such as List<>.
Now that the stability of Realm has been demonstrated in the field with downloads metadata, we are moving forward with adopting it more broadly in the app for generalized video metadata storage.
JobScheduler was introduced in Lollipop and allows us to be more resource-efficient in our background processing and network requests. The OS can batch jobs together for an overall efficiency gain. Longer-term, we wanted to build up our experience with this system component since developers will be encouraged more strongly by Google to use it in the future (e.g. Android ‘O’).
For our download use cases, it provided a great opportunity to get low-cost (or effectively free) network usage by creating jobs that would only activate when the user was on an unmetered network. What can our app do in the background?
1. Maintenance jobs:
   - Content license renewals
   - Metadata updates
   - Sync playback metrics: operations, state data, and usage
2. Resume downloads when connectivity is restored
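On Android, the unmetered-network condition is declared on the job itself, for example with `JobInfo.Builder.setRequiredNetworkType(JobInfo.NETWORK_TYPE_UNMETERED)`, and the OS decides when to run it. The sketch below models only the effect of that constraint in plain Python. The `Job` class, the job names, and the choice to let `resume_downloads` run on any network are assumptions for illustration (in practice the last would follow the user's WiFi-only preference).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Job:
    name: str
    requires_unmetered: bool  # True: run only on an unmetered network


def runnable_jobs(jobs: List[Job], connected: bool, metered: bool) -> List[str]:
    """Return the names of jobs whose network constraint is currently met."""
    if not connected:
        return []  # every job in this sketch needs some network
    return [j.name for j in jobs if not (j.requires_unmetered and metered)]


JOBS = [
    Job("license_renewal", requires_unmetered=True),
    Job("metadata_update", requires_unmetered=True),
    Job("sync_playback_metrics", requires_unmetered=True),
    # Assumption: resuming is allowed on any network in this sketch.
    Job("resume_downloads", requires_unmetered=False),
]
```

For instance, `runnable_jobs(JOBS, connected=True, metered=True)` leaves the three maintenance jobs waiting until the device is back on an unmetered connection.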
There were two major issues we found with JobScheduler. The first was how to provide the same background updates on pre-Lollipop devices, where JobScheduler is unavailable. We wrote an abstraction layer on top of the job-scheduling component; on pre-Lollipop devices it uses the system’s network connectivity receiver and the AlarmManager service to schedule background tasks manually at set times.
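An abstraction layer of this kind usually comes down to one interface with a backend chosen per device. The following is a rough sketch under that assumption, in Python rather than Android Java. The class names and the string return values are stand-ins, and the only platform fact relied on is that Lollipop corresponds to API level 21.

```python
from abc import ABC, abstractmethod

LOLLIPOP_API_LEVEL = 21  # Android 5.0, where JobScheduler was introduced


class BackgroundScheduler(ABC):
    """Common interface the rest of the app would code against."""

    @abstractmethod
    def schedule(self, job_name: str) -> str:
        """Schedule a job and return a description of how it was scheduled."""


class JobSchedulerBackend(BackgroundScheduler):
    def schedule(self, job_name: str) -> str:
        # Stand-in for handing the job to the OS job scheduler.
        return f"jobscheduler:{job_name}"


class LegacyBackend(BackgroundScheduler):
    def schedule(self, job_name: str) -> str:
        # Stand-in for a connectivity receiver plus an AlarmManager alarm.
        return f"alarm+connectivity:{job_name}"


def make_scheduler(sdk_level: int) -> BackgroundScheduler:
    """Pick the backend once, based on the device's API level."""
    if sdk_level >= LOLLIPOP_API_LEVEL:
        return JobSchedulerBackend()
    return LegacyBackend()
```

Callers only ever see `BackgroundScheduler`, so the legacy path can be deleted later without touching the code that schedules jobs.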
The second major problem we encountered was that JobScheduler crashed in certain circumstances (we filed a public bug report). While we weren’t able to put in a direct fix for this crash, we found a workaround: in certain cases we avoid calling JobService.jobFinished() altogether. The job ultimately times out on its own, so the cost of operating like this seemed better than permitting the app to crash.
Playback of Content
There are a number of methods of playing video on Android, varying in their complexity and level of control.
Further, playback of offline (non-streaming) content is not supported by the Android system DASH player. It wasn’t the only option, but we felt that downloads were a good opportunity to try Google’s new Android ExoPlayer. The features we liked were:
- Support for DASH, HLS, Smooth Streaming, and local sources
- Extremely modular design, extensible and customizable
- Used by Google/OEMs/SoC vendors as part of Android certification (which helps with device support and fragmentation)
- Great documentation and tutorials
The modularity of ExoPlayer was attractive for us since it allowed us to plug in a variety of DRM solutions. Our previous in-app DRM solution did not support offline licenses so we also needed to provide support for an alternate DRM mechanism.
Widevine was selected due to its broad Android support, ability to work with offline licenses, a hardware-based decryption module with a software fallback (suitable for nearly any mobile device), and validation required by Android’s Compatibility Test Suite (CTS).
However, this was a difficult migration due to Android fragmentation. Some devices that should have had L3 didn’t, some devices had insecure implementations, and other devices had Widevine APIs that failed whenever we called them. Support was therefore inconsistent, so we had to have reporting in place to monitor these failure rates.
If we detect this kind of failure during app init then we have little choice but to disable the Downloads feature on that device since playback would not be possible. This is unfortunate for users but will hopefully improve over time as the operating system is updated on devices.
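The init-time decision described above can be sketched as a single guarded probe. This is a platform-neutral illustration in Python, not Netflix's implementation: `probe_drm` and `report` are hypothetical callables standing in for a real Widevine capability check and for whatever reporting pipeline records failure rates.

```python
from typing import Callable


def downloads_enabled(probe_drm: Callable[[], bool],
                      report: Callable[[str], None]) -> bool:
    """Run a DRM probe once at startup and decide whether to offer Downloads.

    `probe_drm` returns True when the device's DRM module behaves correctly
    and may raise if the platform API itself is broken. `report` records the
    failure reason so failure rates can be monitored per device.
    """
    try:
        if probe_drm():
            return True
        report("drm_probe_returned_unsupported")
    except Exception as exc:  # broad on purpose: any probe failure disables the feature
        report(f"drm_probe_failed:{type(exc).__name__}")
    return False
```

Failing closed here means a user on a broken device never starts a download that could not be played back later.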
Improving Video Quality
Our encoding team has written previously about the specific work they did to enable high-quality, low-bandwidth mobile encodes using VP9 for Android. However, how did we decide to use VP9 in the first place?
Most mobile video streams for Netflix use H.264/AVC with the Main Profile (AVCMain). Downloads were a good opportunity for us to migrate to a new video codec to reduce downloaded content size and pave the way for improved streaming bitrates in the future. The advantages of VP9 encoding for us included:
- Encodes produced using libvpx are ~32% more efficient than our x264 encodes.
- A VP9 decoder has been required on Android since KitKat, i.e. 100% coverage for the current Netflix app deployment
- Fragmented but growing hardware support: 33% of phones and 4% of tablets using Netflix have a chipset that supports VP9 decoding in hardware
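To get a feel for what the efficiency figure means for a download, here is a back-of-the-envelope calculation. It assumes that "~32% more efficient" means roughly 32% fewer bits at comparable quality, and it uses a hypothetical 1000 kbps AVCMain bitrate for a one-hour title. Neither the bitrate nor the resulting sizes are Netflix figures.

```python
def download_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Size in decimal megabytes for a stream at a constant average bitrate."""
    return bitrate_kbps * duration_s / 8 / 1000


AVC_BITRATE_KBPS = 1000.0  # hypothetical AVCMain bitrate, for illustration only
VP9_SAVING = 0.32          # "~32% more efficient", read as 32% fewer bits
ONE_HOUR_S = 3600

avc_mb = download_size_mb(AVC_BITRATE_KBPS, ONE_HOUR_S)
vp9_mb = download_size_mb(AVC_BITRATE_KBPS * (1 - VP9_SAVING), ONE_HOUR_S)
print(round(avc_mb), round(vp9_mb))  # prints: 450 306
```

Under these assumptions, an hour of video shrinks from about 450 MB to about 306 MB, which matters both for device storage and for members on metered connections.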
Migrating to support a new video encode had some up-front and ongoing costs, not the least of which was an increased burden placed on our content-delivery system, specifically our Open-Connect Appliances (OCAs). Because of the new encoding formats, more versions of the video streams needed to be deployed and cached in our CDN, which required more space on the boxes. This cost was worthwhile for us to provide improved efficiency for downloaded content in the near term, and in the long term it will also benefit members streaming on mobile as we migrate to VP9 more broadly.
Many teams at Netflix were aligned to work together and release this feature under an ambitious timeline. We were pleased to bring lots of joy to our members around the world and give them the ability to take their favorite shows with them on the go. The largest share of downloading has been in Asia, where we see strong traction in countries like India, Thailand, Singapore, Malaysia, the Philippines, and Hong Kong.
The main suggestion we received for Android was around lack of SD card support, which we quickly addressed in a subsequent release in early 2017. We have now established a baseline experience for downloads, and will be able to A/B test a number of improvements and feature enhancements in coming months.
Originally published at techblog.netflix.com on March 8, 2017.