The Challenges of Live Linear Video Ingest — Part One: Live Versus On-Demand System Requirements
By Allison Deal, Senior Software Developer
Prior to launching our live TV service last May, Hulu delivered on-demand content to millions of viewers, providing them with access to one of the largest content libraries and their favorite in-season shows next day. While our video ingestion system for SVOD is by no means simple, adding live TV to the mix added a layer of complexity and posed a unique challenge for the small but mighty video team here at Hulu.
Live TV requires systems that are vastly different from SVOD, and our team had to stand up a completely new stack to launch the live service. Existing SVOD systems also needed to be modified so we could support both offerings. In part one of this blog post series, I’m going to give a high-level overview of the major challenges and requirements we kept in mind when designing our live video ingest system.
Building a System for Live Versus SVOD
The requirements for a live service differ vastly from the requirements for an SVOD service, and our existing on-demand video and metadata ingest systems could not support live linear stream ingestion for a number of reasons:
- Live systems have to be built for high availability and resiliency, because there is little opportunity to recover lost data. On-demand content is generally received hours or even days before it is scheduled to be available to viewers, so if we experience an issue with a file, we have time to ask content partners to modify or re-deliver it before it appears on the platform. This strategy is not an option for a live service.
- On-demand videos are ingested at the asset (episode) level, but a channel’s linear stream requires continuous 24/7 ingestion.
- On-demand metadata is also asset-based, whereas live stream metadata is channel- and timestamp-oriented. A program’s duration is pre-set for video on demand (VOD) assets and will not change, but live programming has flexible program boundaries that can shift when sporting events run long or breaking news occurs.
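To make the metadata difference concrete, here is a minimal sketch of the two shapes of record. The class names, fields, and sample values are hypothetical and chosen for illustration only; they are not Hulu's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class VodAsset:
    """On-demand metadata: one record per asset, with a fixed duration."""
    asset_id: str
    title: str
    duration: timedelta


@dataclass
class LiveProgram:
    """Live metadata: a program is a time window on a channel."""
    channel_id: str
    title: str
    start: datetime
    end: datetime  # may move while the program is airing

    def extend(self, extra: timedelta) -> None:
        """Push the end boundary later, e.g. when a game runs long."""
        self.end += extra

    def contains(self, ts: datetime) -> bool:
        """True if a stream timestamp falls inside this program's window."""
        return self.start <= ts < self.end


# Hypothetical three-hour program that ends up running long.
kickoff = datetime(2018, 1, 1, 20, 0, tzinfo=timezone.utc)
game = LiveProgram("channel-1", "Example Game", kickoff, kickoff + timedelta(hours=3))

late = kickoff + timedelta(hours=3, minutes=10)
in_window_before = game.contains(late)  # False: past the scheduled end
game.extend(timedelta(minutes=30))
in_window_after = game.contains(late)   # True: the boundary has moved
```

The on-demand record can be immutable because its duration never changes, while the live record has to allow its end time to move after the program has already started.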
In addition to addressing these challenges, we had a number of requirements and design objectives we needed to meet when building our new ingestion system.
Requirements and Design Goals
Our Live TV offering needs to deliver video to a variety of devices, which means the service needs to produce both Dynamic Adaptive Streaming over HTTP (MPEG-DASH) and Apple’s HTTP Live Streaming (HLS) media formats. At a high level, our system must ingest contribution streams from vendors, remux and repackage the video segments, and publish them to the CDN origin. In addition to the video workflow, the service needs to process ad and program markers from the ingested stream, both to present live assets as programs (not simply as channels) with accurate start and end times and to identify the sections of a program that are ads.
Our live TV service should not have any downtime, planned or unplanned. Video playback should be continuous and seamless for viewers.
Our video ingestion system needs to be compatible with the multiple vendors and networks that provide feeds of national and local channels across the country. It needs to accommodate vendor-specific encoding outputs, including varying segment lengths, segment naming schemes, and different resolutions and bitrates.
Content protection is critical for both MPEG-DASH and HLS formats. Our live ingest service needs to encrypt media segments with AES-128 based on the MPEG Common Encryption (CENC) specification, package them in fMP4 and MPEG-TS file formats, and apply commercial DRM across all streaming formats.
The system needs to be fault-tolerant so that an issue or slowness with one channel will not affect the ingest of any other channel or degrade the experience of our viewers.
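One common way to get this kind of isolation is to give every channel its own queue and worker, so that a failure or backlog stays with the channel that caused it. The sketch below illustrates that general idea only; the names (ChannelWorker, process_segment), the channel IDs, and the "corrupt" marker segment are all hypothetical, and this is not a description of how Hulu's system is built.

```python
import queue
import threading


def process_segment(channel_id, segment):
    """Stand-in for real remux/package work; fails on a marker segment."""
    if segment == "corrupt":
        raise ValueError(f"bad segment on {channel_id}")
    return f"{channel_id}/{segment}"


class ChannelWorker(threading.Thread):
    """One worker and one queue per channel, so channels share no backlog."""

    def __init__(self, channel_id):
        super().__init__(daemon=True)
        self.channel_id = channel_id
        self.inbox = queue.Queue()
        self.published = []
        self.errors = []

    def run(self):
        while True:
            segment = self.inbox.get()
            if segment is None:  # sentinel: stop this channel only
                break
            try:
                self.published.append(process_segment(self.channel_id, segment))
            except Exception as exc:  # contain the failure to this channel
                self.errors.append(str(exc))


workers = {cid: ChannelWorker(cid) for cid in ("news", "sports")}
for worker in workers.values():
    worker.start()

# The "news" channel receives one bad segment; "sports" receives none.
for seg in ("seg1", "corrupt", "seg3"):
    workers["news"].inbox.put(seg)
for seg in ("seg1", "seg2", "seg3"):
    workers["sports"].inbox.put(seg)

for worker in workers.values():
    worker.inbox.put(None)
    worker.join()
```

After the run, the bad segment shows up only in the "news" worker's error list, that worker still publishes the segments on either side of it, and the "sports" channel is unaffected.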
To provide viewers with a real-time experience, the system must be performant and process each segment in less time than it takes the average segment to play back, so that large job queues do not build up.
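The reasoning behind this requirement can be checked with a small simulation. It is a sketch under simplifying assumptions: one worker per channel, segments arriving at a fixed interval, and a constant processing time. The function name and the numbers used with it are hypothetical.

```python
def backlog_after(n_segments, segment_seconds, process_seconds):
    """Count segments still waiting or in progress when the last one arrives.

    Assumes a single worker, one segment arriving every `segment_seconds`,
    and a constant `process_seconds` of work per segment.
    """
    worker_free_at = 0.0
    finish_times = []
    for i in range(n_segments):
        arrival = i * segment_seconds
        start = max(arrival, worker_free_at)  # wait if the worker is busy
        worker_free_at = start + process_seconds
        finish_times.append(worker_free_at)
    last_arrival = (n_segments - 1) * segment_seconds
    return sum(1 for t in finish_times if t > last_arrival)


keeps_up = backlog_after(100, 6, 4)      # processing faster than playback
falls_behind = backlog_after(100, 6, 9)  # processing slower than playback
```

With hypothetical 6-second segments, 4 seconds of processing leaves only the newly arrived segment pending after 100 segments, while 9 seconds of processing leaves 34 pending, and that backlog keeps growing for as long as the stream runs.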
The system has to be able to scale to process thousands of channels simultaneously, so that it can handle additional channels as our content partners grow.
The system needs to be designed so that new features, such as dynamic ad insertion, processing demuxed audio and video input files, and processing different video frame rates, can be added easily, giving us the opportunity to bring new and expanded features to our viewers.
We took all these system requirements into consideration when designing and building our new video ingestion platform for live TV. In part two of this blog post, we’ll discuss how we designed and implemented the system with these parameters in mind.
Attending Grace Hopper this year? Come say hello and join me for an in-depth talk about our live linear video ingest system on Thursday, 27th at 1PM!
Allison Deal is a senior software developer at Hulu, specializing in video encoding and streaming technologies. She works on building and scaling the end-to-end live and on-demand video pipelines, with the ultimate goal of improving the playback experience for all viewers. She has been at Hulu for over three years, with prior stints at Rdio and Boeing, where she worked in Research and Development.
If you’re interested in working on projects like these and powering play at Hulu, see our current job openings here.