Server-side ad insertion with DRM

Compared with client-side ad insertion, server-side ad insertion (SSAI) makes it harder for ad-blockers to block the ads without interrupting playback, demands less of the playback device and gives a smoother transition in and out of an ad break. One requirement that makes SSAI more challenging to implement is protecting the streams with DRM. The situation today is that your SSAI solution needs to support at least two streaming formats and a number of DRM systems. In this blog post I will describe how this could be done.

Server-side ad insertion versus client-side ad insertion

Describing how SSAI works in detail is a blog post of its own, but the principle is that the SSAI component takes a media stream and creates a virtual media stream for each unique viewer. The virtual streams are identical except during ad breaks. Just in time for an ad break, the SSAI component communicates with an ad decision server, which uses business logic to decide which ad videos to place in the virtual stream for that particular viewer. All of this takes place on the server side, and the video player plays the virtual stream as normal.
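
To make the principle concrete, here is a minimal sketch in Python. It is an illustration, not how any particular SSAI product works: the `AD_BREAK` placeholder, the `AD_INVENTORY` table and the `decide_ads` function are hypothetical stand-ins for the ad break signalling and the ad decision server.

```python
from typing import Dict, List

# Hypothetical ad inventory keyed by viewer profile. A real ad decision
# server would apply business logic behind a network call instead.
AD_INVENTORY: Dict[str, List[str]] = {
    "sports": ["ad_sneakers.ts", "ad_energy_drink.ts"],
    "default": ["ad_generic.ts"],
}


def decide_ads(viewer_profile: str) -> List[str]:
    """Stand-in for the ad decision server."""
    return AD_INVENTORY.get(viewer_profile, AD_INVENTORY["default"])


def build_virtual_stream(source: List[str], viewer_profile: str) -> List[str]:
    """Replace every AD_BREAK placeholder with the ads chosen for this viewer."""
    virtual: List[str] = []
    for item in source:
        if item == "AD_BREAK":
            virtual.extend(decide_ads(viewer_profile))
        else:
            virtual.append(item)
    return virtual


source_stream = ["seg1.ts", "seg2.ts", "AD_BREAK", "seg3.ts"]
print(build_virtual_stream(source_stream, "sports"))
print(build_virtual_stream(source_stream, "news"))
```

Both viewers get the same content segments, and only the segments inside the ad break differ.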

With client-side ad insertion we instead need two video players in the viewer application: one to play the media stream and another to play the ad videos. At the time of the ad break the media stream player is paused and the ad video player communicates with the ad server, which provides a location where the ad videos can be downloaded. The ad videos are then downloaded and played back by the ad video player.

An ad-blocker is an application installed on the mobile device or computer that contains a list of Internet addresses of ad servers or ad video locations. In a client-side ad insertion scenario, the ad-blocker prevents the ad video player from downloading these video files by simply blocking these addresses, with the result that no ads are played back for the viewer. With server-side ad insertion the ad videos are delivered by proxy from the same address as the content, and even if it were possible to somehow identify which parts of the stream are ads, blocking these chunks would interrupt the playback.
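
The difference can be sketched with a simplified, host-based blocklist. The host names below are made up for the example, and real ad-blockers use richer rule sets, but the principle is the same:

```python
from typing import List
from urllib.parse import urlparse

# Hypothetical blocklist containing the host of an ad server.
BLOCKLIST = {"ads.example.net"}


def is_blocked(url: str) -> bool:
    """Return True if the URL's host is on the blocklist."""
    return urlparse(url).hostname in BLOCKLIST


def allowed(urls: List[str]) -> List[str]:
    """Return the requests that the ad-blocker lets through."""
    return [url for url in urls if not is_blocked(url)]


# Client-side insertion: the ad is fetched from a separate, known ad host.
client_side = [
    "https://cdn.example.com/channel/seg1.ts",
    "https://ads.example.net/creative/ad1.mp4",
]

# Server-side insertion: ad segments come from the same host as the content.
server_side = [
    "https://cdn.example.com/channel/seg1.ts",
    "https://cdn.example.com/channel/seg2.ts",  # this one is actually an ad
]

print(allowed(client_side))  # the ad request is filtered out
print(allowed(server_side))  # nothing to tell apart by address
```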

With client-side ad insertion the mobile device needs to hold two video players in memory at the same time, and the communication with the ad server adds network requests (and potential round-trip latencies). On a thin device such as a Chromecast dongle it is not possible to load two video players in memory at the same time.


As piracy is a problem today we also need to protect the content and media streams with a DRM (digital rights management) system. Anyone who has worked a couple of years in this industry knows that DRM support on consumer devices is very fragmented today. Apple supports one DRM system, Google another and Microsoft a third. On top of this, the devices support different streaming formats. The situation here is a little better: Google, Microsoft and Samsung all support the same streaming format, MPEG DASH, and the exception is Apple, which only supports HLS.

So, for example, to reach consumers on Mac, PC, iPhone/iPad, Samsung phones (Android), Apple TV, Chromecast and Samsung TVs, your SSAI solution needs to support HLS with FairPlay DRM and MPEG DASH with Widevine and PlayReady DRM.
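
As a rough illustration, the requirement can be written down as a small lookup table. The table is a simplification that pairs each vendor ecosystem with a single streaming format and DRM system, and the `SUPPORT` and `required_combinations` names are my own:

```python
from typing import Dict, Iterable, List, Tuple

# Simplified view: one (streaming format, DRM system) pair per vendor ecosystem.
SUPPORT: Dict[str, Tuple[str, str]] = {
    "apple": ("HLS", "FairPlay"),
    "google": ("MPEG-DASH", "Widevine"),
    "microsoft": ("MPEG-DASH", "PlayReady"),
}


def required_combinations(vendors: Iterable[str]) -> List[Tuple[str, str]]:
    """Return the distinct format/DRM pairs needed to reach the given vendors."""
    return sorted({SUPPORT[vendor] for vendor in vendors})


for streaming_format, drm in required_combinations(["apple", "google", "microsoft"]):
    print(f"{streaming_format} with {drm}")
```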

Because of how the SSAI component creates the virtual streams in a scalable way, the video player must also be capable of playing HLS with DISCONTINUITY tags and MPEG DASH containing multiple periods. These are the mechanisms in the streaming formats that make it possible to replace video chunks in the original media stream with the ad video chunks.
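
For HLS, the sketch below shows roughly what such a playlist looks like. It builds a short VOD-style media playlist with an `#EXT-X-DISCONTINUITY` tag on each side of the ad segments. The function names and segment URIs are made up, and a live playlist would also need a sliding window and discontinuity sequence bookkeeping, which is left out here.

```python
import math
from typing import List, Tuple

Segment = Tuple[float, str]  # (duration in seconds, URI)


def render_segments(segments: List[Segment]) -> List[str]:
    """Render segments as EXTINF/URI line pairs."""
    lines: List[str] = []
    for duration, uri in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    return lines


def splice_ad_break(before: List[Segment], ads: List[Segment],
                    after: List[Segment]) -> str:
    """Build a media playlist with the ads spliced in between content segments."""
    longest = max(duration for duration, _ in before + ads + after)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{math.ceil(longest)}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    lines += render_segments(before)
    lines.append("#EXT-X-DISCONTINUITY")  # content -> ad
    lines += render_segments(ads)
    lines.append("#EXT-X-DISCONTINUITY")  # ad -> content
    lines += render_segments(after)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


playlist = splice_ad_break(
    before=[(10.0, "seg1.ts")],
    ads=[(5.0, "ad1.ts")],
    after=[(10.0, "seg2.ts")],
)
print(playlist)
```

The discontinuity tag tells the player that timestamps and encoding parameters may change at that point, which is what allows ad segments from another encode to sit next to the content segments.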

So, let us have a look at the architecture for the use case where you have a traditional broadcast TV channel that you also distribute over the Internet with your OTT service (simulcast). The TV channel contains TV commercials that you want to replace with ads from your online ad inventory when watched via your OTT service.

Starting from the left in the drawing above we assume that the TV playout centre can signal where an ad break starts and the duration of the break. SCTE-35 is an existing standard that can be used to signal this information in a Transport Stream. Then when transcoding this video signal to the bitrates needed for the streaming formats it is important that this information is maintained.

The packager chunks the video stream into encrypted video file segments according to the different DRM systems. FairPlay requires one type of encryption while Widevine and PlayReady can use the same type of encryption (common encryption). The packager also creates the streaming format manifests. To signal where an ad break starts and ends, it can use the tags EXT-X-CUE-OUT and EXT-X-CUE-IN in the case of HLS, and for MPEG DASH it creates a new period. The video chunks and the streaming format manifests are uploaded to the origin.
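
As an example of how a downstream component could read this signalling, here is a small sketch that finds the ad breaks in an HLS media playlist. The exact syntax of the cue tags varies between packagers, so I assume the two common forms `#EXT-X-CUE-OUT:DURATION=20` and `#EXT-X-CUE-OUT:20`. The `AdBreak` type and `find_ad_breaks` function are names made up for this illustration.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class AdBreak(NamedTuple):
    start: int       # index of the first segment inside the break
    end: int         # index of the first segment after the break
    duration: float  # signalled break duration in seconds


# Accepts "#EXT-X-CUE-OUT:DURATION=20" as well as "#EXT-X-CUE-OUT:20".
CUE_OUT = re.compile(r"^#EXT-X-CUE-OUT:(?:DURATION=)?(\d+(?:\.\d+)?)")


def find_ad_breaks(playlist: str) -> List[AdBreak]:
    """Locate CUE-OUT/CUE-IN pairs and express them as segment index ranges."""
    breaks: List[AdBreak] = []
    segment_index = 0
    open_break: Optional[Tuple[int, float]] = None
    for raw in playlist.splitlines():
        line = raw.strip()
        cue_out = CUE_OUT.match(line)
        if cue_out:
            open_break = (segment_index, float(cue_out.group(1)))
        elif line.startswith("#EXT-X-CUE-IN") and open_break is not None:
            start, duration = open_break
            breaks.append(AdBreak(start, segment_index, duration))
            open_break = None
        elif line and not line.startswith("#"):
            segment_index += 1  # a non-tag line is a segment URI
    return breaks


EXAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
seg1.ts
#EXT-X-CUE-OUT:DURATION=20
#EXTINF:10.0,
seg2.ts
#EXTINF:10.0,
seg3.ts
#EXT-X-CUE-IN
#EXTINF:10.0,
seg4.ts
"""

print(find_ad_breaks(EXAMPLE))
```

Here the break covers the segments at index 1 and 2 (`seg2.ts` and `seg3.ts`), which are the ones an SSAI component would replace.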

The SSAI component then fetches the HLS and MPEG DASH streams from the origin and parses the streaming format manifests to be able to create a personalized and unique virtual stream for the viewer. The virtual streams also contain the information that the DRM decryption module needs to acquire a license from the DRM system.
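
For HLS this information is carried in the `#EXT-X-KEY` tag, which applies to every segment that follows it until the next `#EXT-X-KEY` tag. The sketch below shows one way the stitching could keep that signalling intact. It rests on an assumption that the text above does not state, namely that the ad segments are delivered unencrypted, so the key is switched off with `METHOD=NONE` during the break and restored afterwards. The key URI and the function name are made up for the example.

```python
from typing import List, Optional

# Example FairPlay-style key tag; the skd:// URI is a made-up placeholder.
FAIRPLAY_KEY = ('#EXT-X-KEY:METHOD=SAMPLE-AES,URI="skd://example-key-id",'
                'KEYFORMAT="com.apple.streamingkeydelivery",KEYFORMATVERSIONS="1"')


def stitch_with_drm(source_lines: List[str], ad_lines: List[str]) -> List[str]:
    """Replace cue-marked breaks with ads while keeping the DRM key signalling.

    Assumption: the ads are unencrypted, so the key is disabled for the
    duration of the break and re-emitted when the content resumes.
    """
    out: List[str] = []
    current_key: Optional[str] = None
    in_break = False
    for line in source_lines:
        if line.startswith("#EXT-X-KEY:"):
            current_key = line  # remember the latest key, even inside a break
            if not in_break:
                out.append(line)
        elif line.startswith("#EXT-X-CUE-OUT:"):
            in_break = True
            out.append("#EXT-X-DISCONTINUITY")
            if current_key is not None:
                out.append("#EXT-X-KEY:METHOD=NONE")
            out.extend(ad_lines)
        elif line.startswith("#EXT-X-CUE-IN"):
            in_break = False
            out.append("#EXT-X-DISCONTINUITY")
            if current_key is not None:
                out.append(current_key)
        elif not in_break:
            out.append(line)  # lines inside the break are dropped
    return out


source = [
    "#EXTM3U",
    "#EXT-X-VERSION:5",
    "#EXT-X-TARGETDURATION:10",
    FAIRPLAY_KEY,
    "#EXTINF:10.0,", "seg1.ts",
    "#EXT-X-CUE-OUT:DURATION=10",
    "#EXTINF:10.0,", "seg2.ts",
    "#EXT-X-CUE-IN",
    "#EXTINF:10.0,", "seg3.ts",
]
stitched = stitch_with_drm(source, ["#EXTINF:10.0,", "ad1.ts"])
print("\n".join(stitched))
```

In MPEG DASH the corresponding signalling sits in the ContentProtection elements of the manifest, so the same idea applies per period.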

When the viewer wants to watch this channel, the video player requests the virtual stream from the SSAI component and starts playing it. The video segments are downloaded from and distributed over the CDN, so it is only the streaming format manifests that the SSAI component delivers to the player.
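
One way to achieve this split, sketched below, is to let the SSAI component rewrite relative segment URIs in the manifest into absolute CDN URLs before returning it to the player. The CDN and ad host names are examples, and the sketch only touches segment URI lines, not URIs inside tags such as `#EXT-X-KEY` or `#EXT-X-MAP`.

```python
from urllib.parse import urljoin


def absolutize_segments(playlist: str, cdn_base_url: str) -> str:
    """Point relative segment URIs at the CDN and leave absolute URLs untouched."""
    out = []
    for line in playlist.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            # urljoin keeps already absolute URLs (for example ad segments) as-is
            out.append(urljoin(cdn_base_url, stripped))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


manifest = (
    "#EXTM3U\n"
    "#EXTINF:10.0,\n"
    "seg1.ts\n"
    "#EXTINF:5.0,\n"
    "https://ads.example.com/ad1.ts\n"
)
print(absolutize_segments(manifest, "https://cdn.example.com/channel1/"))
```

The player then fetches every segment directly from the CDN, and the SSAI component only has to scale with the number of manifest requests.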

To summarize, to have both server-side ad insertion and DRM protection you need a DRM provider that can handle more than one DRM system and an SSAI component that can handle both HLS and MPEG DASH. If you don’t have the DRM requirement, an SSAI component that handles only HLS is sufficient, as most devices can play HLS as long as it is unencrypted.

If you have any further questions or comments on this blog post, drop a comment below or tweet me on Twitter (@JonasBirme).

Jonas Birmé is a Solution Architect at Eyevinn Technology, a Sweden-based consultancy company specialized in video and streaming technology.