IMF: A Prescription for Versionitis

This blog post provides an introduction to the emerging IMF (Interoperable Master Format) standard from SMPTE (The Society of Motion Picture and Television Engineers), and delves into a short case study that highlights some of the operational benefits that Netflix receives from IMF today.

Have you ever noticed that your favorite movie or TV show feels a little different depending on whether you’re watching it on Netflix, on DVD, on an airplane or from your local cable provider? One reason could be that you’re watching a slightly different edit. In addition to changes for specific distribution channels (like theatrical widescreen, HD home video, airline edits, etc.), content owners typically need to create new versions of their movie or television show for distribution in different territories.

Netflix licenses the majority of its content from other owners, sometimes years after the original assets were created, and often for multiple territories. This leads to a number of problems, including receiving cropped or pan-and-scanned versions of films. We also frequently run into problems when we try to sync dubbed audio and/or subtitles. For example, a film shot and premiered theatrically at 24 frames per second (fps) may be converted to 29.97fps and/or re-cut for a specific distribution channel. Alternate language assets (like audio and timed text) are then created to match the derivative version.

In order to preserve the artist’s creative intent, Netflix always requests content in its original format (native aspect ratio, frame rate, etc.). In the case of a film, we would receive a 24fps theatrical version of the video, but the dubbed audio and subtitles won’t necessarily match, as they may have been created from the 29.97fps version, or even another version that was re-cut for international distribution. We’ve coined the term “Versionitis” to describe this asset-management malady.

Luckily, the good folks over at SMPTE (whom you may know from ubiquitous standards like countdown leader, timecode and color bars, among others) have been hard at work, capitalizing on some of the successes of digital cinema, to design a better system of component-ized file-based workflows with a solution to versioning right in its DNA. If not a cure for versionitis, we’re hoping that IMF will at least provide some relief from this pernicious condition.

The Interoperable Master Format

The advance of technology within the motion picture post-production industry has effected a paradigm shift, moving the industry from tape-based to file-based workflows. The need for a standardized set of specifications for the file-based workflow has given birth to the Interoperable Master Format (IMF). IMF is a file-based framework designed to facilitate the management and processing of multiple content versions (airline edits, special editions, alternate languages, etc.) of the same high-quality finished work (feature, episode, trailer, advertisement, etc.) destined for distribution channels worldwide. The key concepts underlying IMF include:

  • Facilitating an internal or business-to-business relationship. IMF is not intended to be delivered to the consumer directly.
  • While IMF is intended to be a specification for the Distribution Service Master, it could be used as an archival master as well.
  • Support for audio and video, as well as data essence in the form of subtitles, captions, etc.
  • Support for descriptive and dynamic metadata (the latter can vary as a function of time) that is expected to be synchronized to an essence.
  • Wrapping (encapsulating) of media essence, data essence and dynamic metadata into well-understood temporal units, called track files, using the MXF (Material eXchange Format) file specification.
  • Each content version is embodied in a Composition, which combines metadata and essences. An example of a composition might be the US theatrical cut or an airline edit.
  • A Composition Playlist (CPL) defines the playback timeline for the Composition and includes metadata applicable to the Composition as a whole via XML.
  • IMF allows for the creation of many different distribution formats from the same composition. This can be accomplished by specifying the processing/transcoding instructions through an Output Profile List (OPL).

The Composition Playlist

The IMF Composition Playlist (CPL) XML defines the playback timeline for the Composition and includes metadata applicable to the Composition as a whole. The CPL is not designed to contain essence but rather reference external Track Files that contain the actual essence. This construct allows multiple compositions to be managed and processed without duplicating common essence files. The IMF CPL is constrained to contain exactly one video track.

The timeline of the CPL (light blue in example) contains multiple Segments designed to play sequentially. Each Segment (dark grey), in turn, contains multiple Sequences (e.g., an image sequence and an audio sequence, beige), that play in parallel. Each sequence is composed of multiple Resources (green and red for image and audio essence respectively) that refer to physical track files, and subsequently, the audio and video samples / frames that comprise the overall composition. In the example above, light grey portions of the track files represent essence samples that are not relevant to this composition.

The flexible CPL mechanism decouples the playback timeline from the underlying track files, allowing for economical and incremental updates to the timeline when necessary. Each CPL is associated with a universally unique identifier (UUID) that can be used to track versioning of the playback timeline. Likewise, resources within the CPL reference essence data via each track file’s UUID.
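To make the Segment / Sequence / Resource hierarchy more concrete, here is a minimal Python sketch of the timeline as a data structure. It is an illustration only, not the CPL XML schema: the class and field names are our own, and it assumes every duration is expressed in frames of the composition.

```python
from dataclasses import dataclass, field
from typing import List, Set
from uuid import UUID, uuid4


@dataclass
class Resource:
    track_file_id: UUID  # UUID of the track file that holds the essence
    entry_point: int     # first frame used from that track file
    duration: int        # number of frames used, starting at entry_point


@dataclass
class Sequence:
    kind: str            # "image" or "audio" in this sketch
    resources: List[Resource]

    def duration(self) -> int:
        # Resources in a sequence play back to back.
        return sum(r.duration for r in self.resources)


@dataclass
class Segment:
    sequences: List[Sequence]

    def duration(self) -> int:
        # Sequences in a segment play in parallel, so they must be equally long.
        lengths = {s.duration() for s in self.sequences}
        if len(lengths) != 1:
            raise ValueError("sequences in a segment must share one duration")
        return lengths.pop()


@dataclass
class CompositionPlaylist:
    segments: List[Segment]
    id: UUID = field(default_factory=uuid4)  # identifies this version of the timeline

    def duration(self) -> int:
        # Segments play sequentially.
        return sum(seg.duration() for seg in self.segments)

    def track_file_ids(self) -> Set[UUID]:
        # Every track file the timeline points at, without duplicates.
        return {r.track_file_id
                for seg in self.segments
                for seq in seg.sequences
                for r in seq.resources}


# Hypothetical track files: two image files and one audio file.
image_a, image_b, audio_a = uuid4(), uuid4(), uuid4()
cpl = CompositionPlaylist(segments=[Segment(sequences=[
    Sequence("image", [Resource(image_a, 0, 100), Resource(image_b, 24, 200)]),
    Sequence("audio", [Resource(audio_a, 0, 300)]),
])])
print(cpl.duration())             # total timeline length in frames
print(len(cpl.track_file_ids()))  # number of distinct track files referenced
```

Because a Resource only stores a track file UUID, an entry point and a duration, a new version of the timeline can reuse existing track files and simply point at different ranges within them.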

Composition Playlist for Supply Chain Automation

The core IMF principles help realize a better asset management system. In order to achieve a higher degree of ingest automation for Netflix’s Digital Supply Chain, additional information needs to be associated with an IMF delivery and meaningful constraints need to be applied to the IMF CPL. Examples of additional information include metadata that associates the viewable timeline with the release title, regions and territories where the timeline can be viewed, and content maturity ratings. The IMF Composition Playlist defines optional constructs that can carry such information, creating an opportunity for tighter integration with the business systems of various players in the entertainment industry ecosystem.

Anatomy of an IMP

Asset delivery and playback timeline aspects are decoupled in IMF. The unit of delivery between two businesses is called an Interoperable Master Package (IMP). An IMP can be described as follows:

  1. An Interoperable Master Package (IMP) shall consist of one Packing List (PKL, an XML file that describes a list of files), and all the files it references.
  2. An IMP (equivalently, the PKL) can contain one or more complete or incomplete Compositions.
  3. A Complete IMP is an IMP containing the complete set of assets comprising one or more Compositions. Mathematically, a complete IMP is such that all of the asset references of all of the CPLs described in the PKL are also contained in the PKL.
  4. A Partial IMP is an IMP containing one or more incomplete Compositions. In other words, some assets needed to complete the composition are not present in the package, i.e., some of the assets referred to by a CPL are not contained in the PKL. Depending upon the order in which IMPs arrive into a content ingestion system, the dangling references associated with a partial IMP may be resolved using assets that came with IMPs previously ingested into the system or may be resolved in the future as more IMPs are ingested.
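The complete-versus-partial distinction boils down to a set comparison, which the following Python sketch illustrates. The function names and the string identifiers are hypothetical, and the sketch assumes the PKL and CPL XML files have already been parsed into plain sets of asset IDs.

```python
from typing import Dict, Set


def dangling_references(pkl_assets: Set[str],
                        cpl_references: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each CPL in the package, return the referenced assets the PKL lacks."""
    missing = {}
    for cpl_id, refs in cpl_references.items():
        gap = refs - pkl_assets  # referenced by the CPL but not listed in the PKL
        if gap:
            missing[cpl_id] = gap
    return missing


def is_complete(pkl_assets: Set[str],
                cpl_references: Dict[str, Set[str]]) -> bool:
    """A complete IMP has no dangling references for any of its CPLs."""
    return not dangling_references(pkl_assets, cpl_references)


# Hypothetical package: one CPL that references one image and two audio track files.
references = {"cpl-1": {"image-1", "audio-en", "audio-fr"}}

complete_pkl = {"cpl-1", "image-1", "audio-en", "audio-fr"}
partial_pkl = {"cpl-1", "image-1", "audio-en"}  # the French audio arrives later

print(is_complete(complete_pkl, references))
print(dangling_references(partial_pkl, references))
```

Returning the missing asset IDs, rather than only a yes/no answer, gives an ingestion system something to wait on when a package is partial.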

In relation to the example above, the indicated composition could be delivered as a single, complete IMP. In this case, the IMP would contain the CPL file with UUID1, image essence track files with UUID6, UUID7 and UUID8 respectively, and audio essence track files with UUID11 and UUID12 respectively.

The same composition could also be delivered as multiple partial IMPs. One such scenario could comprise an IMP1 containing CPL file with UUID1 and one audio essence track file with UUID11, and an IMP2 containing image essence track files with UUID6, UUID7 and UUID8 respectively and the audio essence track file with UUID12.
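The two-package scenario can be traced with a small Python sketch of an ingestion store that accumulates assets across deliveries. The AssetStore class is a hypothetical illustration, not a description of our actual ingest system; the UUID labels are the ones used in the example.

```python
from typing import Dict, Iterable, Set


class AssetStore:
    """Hypothetical store that remembers every asset and CPL ingested so far."""

    def __init__(self) -> None:
        self.assets: Set[str] = set()
        self.cpl_references: Dict[str, Set[str]] = {}

    def ingest(self, pkl_assets: Iterable[str],
               cpl_references: Dict[str, Set[str]]) -> None:
        # Record the files listed in the PKL and the references of any CPLs it carries.
        self.assets |= set(pkl_assets)
        self.cpl_references.update(cpl_references)

    def unresolved(self, cpl_id: str) -> Set[str]:
        # Track files the CPL needs that no ingested IMP has delivered yet.
        return self.cpl_references[cpl_id] - self.assets


# The CPL with UUID1 references three image and two audio track files.
cpl_refs = {"UUID1": {"UUID6", "UUID7", "UUID8", "UUID11", "UUID12"}}

store = AssetStore()

# IMP1: the CPL plus one audio track file.
store.ingest({"UUID1", "UUID11"}, cpl_refs)
after_imp1 = store.unresolved("UUID1")
print(sorted(after_imp1))

# IMP2: the image track files and the remaining audio track file, no CPL.
store.ingest({"UUID6", "UUID7", "UUID8", "UUID12"}, {})
after_imp2 = store.unresolved("UUID1")
print(sorted(after_imp2))
```

The final result does not depend on arrival order: if IMP2 were ingested first, its track files would simply wait in the store until the CPL in IMP1 arrived and referenced them.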

Case Study — House of Cards Season 3

Netflix started ingesting Interoperable Master Packages in 2014, when we started receiving Breaking Bad 4K masters. Initial support was limited to complete IMPs (as defined above), with constrained CPLs that only referenced one ImageSequence and up to two AudioSequences, each contained in its own track file. CPLs referencing multiple track files, with timeline offsets, were not supported, so these early IMPs are very similar to a traditional muxed audio / video file.

In February of 2015, shortly before the House of Cards Season 3 release date, the Netflix ident (the animated Netflix logo that precedes and follows a Netflix Original) was given the gift of sound.

Unfortunately, all episodes of House of Cards had already been mastered and ingested with the original video-only ident, as had all of the alternative language subtitles and dubbed audio tracks. House of Cards has marked a number of critical milestones for Netflix, and it was important to us to launch season 3 with the new ident. In the pre-IMF days, addressing this problem would have been an expensive, operationally demanding, and very manual process, requiring re-QC of all of our masters and language assets (dubbed audio and subtitles) for all episodes. Instead, it was a relatively simple exercise in IMF versioning and componentized delivery.

Rather than requiring an entirely new master package, the addition of ident audio to each episode required only new per-episode CPLs. These new CPLs were identical to the old, but referenced a different set of audio track files for the first ~100 frames and the last ~100 frames. Because this did not change the overall duration of the timeline, and did not adjust the timing of any other audio or video resources, there was no danger of other, already encoded, synchronized assets (like dubbed audio or subtitles) falling out-of-sync as a result of the change.
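The following Python sketch shows the shape of that CPL change for a single audio sequence. It is a simplified illustration: the track file names, the 72,000-frame episode length and the exact 100-frame ident length are assumptions, and the original audio is assumed to be a single resource.

```python
from typing import List, NamedTuple


class Resource(NamedTuple):
    track_file: str   # identifier of the audio track file
    entry_point: int  # first frame used from that track file
    duration: int     # number of frames used


def with_ident_audio(original: Resource, head: Resource, tail: Resource) -> List[Resource]:
    """Replace the first and last frames of the original audio with ident audio.

    The middle of the original track file is kept, starting at the same
    timeline position as before, so the total duration does not change.
    """
    body_duration = original.duration - head.duration - tail.duration
    if body_duration < 0:
        raise ValueError("ident audio is longer than the original timeline")
    body = Resource(original.track_file,
                    original.entry_point + head.duration,
                    body_duration)
    return [head, body, tail]


def total_duration(resources: List[Resource]) -> int:
    return sum(r.duration for r in resources)


# Hypothetical episode: 72,000 frames of audio in one track file.
old_audio = [Resource("episode-audio", 0, 72000)]

# Hypothetical ident audio track files covering the first and last 100 frames.
new_audio = with_ident_audio(old_audio[0],
                             head=Resource("ident-audio-open", 0, 100),
                             tail=Resource("ident-audio-close", 0, 100))

print(total_duration(old_audio) == total_duration(new_audio))
print(new_audio[1])  # the untouched middle of the original track file
```

Only the CPL changes. The original episode audio track file is still referenced for everything between the two idents, at the same timeline position as before.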

To Be Continued …

Next in this series, we will describe our IMF ingest implementation and how it fits into our content processing pipeline.

by Rohit Puri, Andy Schuler and Sreeram Chakrovorthy

Originally published on March 7, 2016.