Introduction to Multicast ABR (M-ABR) — Where it works and where it totally fails

Last year (2016), CableLabs published a very interesting document entitled “IP Multicast Adaptive Bit Rate Architecture Technical Report”, describing how to bring together two fundamental and previously incompatible network concepts, multicast and adaptive bitrate delivery, in what it calls Multicast ABR (M-ABR). But does it make sense? And if it does, for which use cases does it work?

Let me spoil the surprise: it does make sense for one single use case. It fails spectacularly elsewhere.

To start with, let’s have a look at the generic M-ABR architecture:

Multicast ABR generic architecture — This version differs from the CableLabs version, as I focus on different aspects.

The CableLabs report describes a video delivery workflow which uses Multicast ABR to reduce the unicast network traffic generated by video content, without requiring changes to the current ABR video clients.

Why is it relevant? Well, for cable operators, live unicast video is the greatest killer of DOCSIS network bandwidth, as it requires low packet loss and a huge constant bandwidth, which varies depending on whether the video is SD, HD or 4K. This issue was only theoretical a few years ago, until the consumption of live content started to ramp up as MSOs made more mobile and tablet applications available.

As the latest Nielsen reports describe, TV consumption outside the traditional TV set is now a significant slice of the overall minutes consumed. It is not replacing the STB as the main device, but adding to it: users have started to consume content on their mobiles and tablets.

As far as cable MSOs are concerned, the not-yet-there possibility of mobile devices replacing STBs as a significant video consumption device is nearly the worst-case scenario: traffic which was previously consumed through a broadcast method (usually QAM, including switched video) is now consumed using unicast, and everyone knows how tight DOCSIS networks are in terms of bandwidth. DOCSIS 3.1 does introduce some significant changes, but to no avail in the M-ABR field. More on that later.

The end result of this shift is that traffic is no longer proportional to the number of linear channels, but to the number of individual devices consuming them, even inside the customer’s home. M-ABR, although limited to live content, aims to combine the current ABR devices with the network efficiency of multicast.

M-ABR to the rescue

So, how do you prevent unicast video from clogging a DOCSIS network? Transport it over multicast and actively convert it on the home gateway. It seems simple, but it’s far from it.

The Multicast ABR architecture aims to comply with a number of objectives, including but not limited to:

  • REQ1 — Decrease network usage from ABR content consumption;
  • REQ2 — Keep the ABR client devices unchanged

Let’s recall the M-ABR architecture:

Multicast ABR generic Architecture — This version differs from the CableLabs version, as I focus on different aspects.

Compared with the traditional ABR video path architecture, M-ABR adds a couple of components:

  • Multicast Server: it has 2 main purposes, namely creating a carousel with the ABR segments, using a technology similar to DSM-CC (although there’s no repetition of packets), and retransmitting missing multicast packets.
  • Multicast to unicast proxy: this is a home gateway software module, which converts multicast content into ABR segments.

With this architecture, a video GOP goes through the following sequence of events:

  1. After being generated at the ABR-aware encoder, it’s sent to an already existing cDVR platform. Up until this point, nothing changes.
  2. After being ingested by the cDVR, the newly generated ABR segment is fetched by the Multicast Server. It’s worth noting that the Multicast Server is a regular ABR client, which means it fetches both manifests (or MPDs) and segments using the ABR rules. This has a significant side effect: it only fetches a segment after it’s made available on the cDVR but, most importantly, it broadcasts that segment at the same rate as the ABR profile. So, for a 2 second segment, it delays the video by (at least) the segment duration, as compared with a regular ABR client.
  3. An end-user device requests a video segment, using the cDVR as the origin address. This request is captured by the home gateway, which performs a number of actions:
     • It joins the multicast group of the requested content and profile.
     • Until the first segments are completely received, it forwards all incoming requests directly to the cDVR, so that the content can be played back as soon as possible.
     • Once the content segments start being received, including any necessary error correction, the home gateway stops forwarding the requests to the cDVR and replies itself with the requested content.
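
To make the gateway behaviour described above more concrete, here is a minimal Python sketch of the decision it takes for each segment request. It is only an illustration under my own assumptions, not code from the CableLabs report: the `GatewayProxy` class, the `join_group` and `fetch_from_origin` callbacks and the `(channel, profile)` key are all hypothetical names.

```python
from typing import Callable, Dict, Set, Tuple

StreamKey = Tuple[str, str]  # (channel, profile): hypothetical identifiers


class GatewayProxy:
    """Toy model of the multicast to unicast proxy running on the home gateway."""

    def __init__(
        self,
        join_group: Callable[[StreamKey], None],
        fetch_from_origin: Callable[[StreamKey, int], bytes],
    ) -> None:
        self._join_group = join_group                # stand-in for a multicast join
        self._fetch_from_origin = fetch_from_origin  # stand-in for a unicast cDVR request
        self._joined: Set[StreamKey] = set()
        self._complete: Dict[Tuple[StreamKey, int], bytes] = {}

    def on_segment_complete(self, key: StreamKey, number: int, data: bytes) -> None:
        """Called once a whole segment (after any repair) has arrived over multicast."""
        self._complete[(key, number)] = data

    def handle_request(self, key: StreamKey, number: int) -> Tuple[str, bytes]:
        """Answer a client's segment request; returns (source, payload)."""
        if key not in self._joined:
            # First request for this content and profile: join its multicast group.
            self._join_group(key)
            self._joined.add(key)
        data = self._complete.get((key, number))
        if data is not None:
            # The segment was fully received over multicast: reply locally.
            return "multicast", data
        # Not available (yet): forward the request to the cDVR origin.
        return "origin", self._fetch_from_origin(key, number)
```

The important detail is the fallback on the last line: any segment which isn’t completely available locally is still fetched over unicast.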

If everything goes right, the end device is now consuming content without significantly increasing traffic on the DOCSIS network. And when we say everything, we really mean everything, namely:

  • Zero packet loss on the DOCSIS network
  • Zero packet loss on the home network
  • WiFi bandwidth is always greater than the highest ABR profile

In this case, the traffic generated by this live content will never exceed the bandwidth used by the content’s highest profile. Any other user consuming the same live content on the same DOCSIS network will not cause any additional traffic, thus fulfilling REQ1 (decrease network usage from ABR content consumption).

But the world is not so rosy

The reduction in network usage is only as great as the number of users consuming the same content. If we take as an example Charter’s average number of users per node, which was 500 as of Q4'16, and the almost 200 channels offered, things stop adding up. Why? Because the number of users per node is decreasing while the number of channels increases every day. This means that the gains obtained from going from unicast to multicast get lower and lower, as fewer end users share the same streams. And this is assuming a permanent blue-sky world.
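
A rough way to see how the gain shrinks is the back-of-the-envelope model below. It rests on assumptions of mine, not on operator data: every concurrent viewer picks one channel uniformly at random (real channel popularity is heavily skewed, which favours multicast), and a single profile is used. The function names are just for illustration.

```python
def expected_multicast_streams(viewers: int, channels: int) -> float:
    """Expected number of distinct channels being watched, i.e. multicast streams.

    Assumes each viewer picks one channel uniformly at random.
    """
    return channels * (1.0 - (1.0 - 1.0 / channels) ** viewers)


def multicast_saving(viewers: int, channels: int) -> float:
    """Fraction of streams saved versus one unicast stream per viewer."""
    if viewers == 0:
        return 0.0
    return 1.0 - expected_multicast_streams(viewers, channels) / viewers


if __name__ == "__main__":
    # 200 channels as in the example above; viewer counts are hypothetical.
    for concurrent_viewers in (500, 100, 20):
        saving = multicast_saving(concurrent_viewers, 200)
        print(f"{concurrent_viewers} viewers: {saving:.0%} of streams saved")
```

Under this model, the fewer concurrent viewers there are on a node for a given channel line-up, the closer the saving gets to zero.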

But the world is anything but blue skies, and things get less rosy very quickly. To start with, let’s recall what “ABR” means: the consumed bandwidth adapts to the E2E network conditions of the end user, and in this case the weakest link is the WiFi network in the customer’s home. WiFi is all too well known for being unreliable and prone to packet loss. As such, every time a user drops from the highest ABR profile to a lower one, one of 2 things needs to happen: either the lower profiles are served directly from the cDVR origin, or a specific multicast group needs to be added to the network for each lower profile. If we take into account that most ABR sources use 5 to 10 bitrate profiles, it’s easy to see how the gains from aggregating users diminish while network complexity multiplies, since every channel now needs one multicast group per profile.

If resiliency on the home network is flaky, on the WAN things are far better. Still, zero packet loss is not assured every time, especially on DOCSIS networks, although things are far better on DOCSIS 3.x than on previous versions. Multicast has no built-in method to recover from packet loss, but fear not: there are brilliant methods to recover missing multicast packets without linearly increasing network traffic. One of those methods is NACK-Oriented Reliable Multicast (NORM), which is one of the most innovative network protocols I’ve ever read about.

But, regardless of how brilliant NORM is, from an E2E perspective it has consequences. The home gateway needs to wait for all the packets which are part of a segment to be received, with or without retransmissions, before being able to forward the whole segment to the end device. Neither DASH using MP4ff nor Smooth Streaming supports incomplete segments. This prevents the proxy component from operating in a byte-range fashion, where every received byte is forwarded to the end device straight away, and forces it into an old store-and-forward approach, where a segment needs to be fully received before it can be forwarded to the end device.
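
The store-and-forward constraint can be sketched in a few lines of Python. This is only an illustration of the idea, not the NORM wire protocol (which adds FEC-based repair and NACK suppression on top): the `SegmentAssembler` class and its packet numbering are hypothetical.

```python
from typing import Dict, List, Optional


class SegmentAssembler:
    """Collects the packets of one segment and releases it only when complete."""

    def __init__(self, total_packets: int) -> None:
        self.total_packets = total_packets
        self._packets: Dict[int, bytes] = {}

    def add(self, index: int, payload: bytes) -> None:
        """Store a packet, whether it is an original or a retransmission."""
        if 0 <= index < self.total_packets:
            self._packets[index] = payload  # duplicates simply overwrite

    def missing(self) -> List[int]:
        """Packet indices a receiver would report back in a NACK."""
        return [i for i in range(self.total_packets) if i not in self._packets]

    def segment(self) -> Optional[bytes]:
        """Return the whole segment, or None while any packet is still missing."""
        if self.missing():
            return None
        return b"".join(self._packets[i] for i in range(self.total_packets))
```

As long as `missing()` returns anything, `segment()` returns nothing, which is exactly why a single late retransmission holds back the whole segment.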

Finally, there’s a problem with REQ2, “Keep the ABR client devices unchanged”. On most, if not all, live ABR client implementations, a number of segments are requested immediately, starting from the most recent one, the “now” segment. And here lies one of the most fundamental problems: the current segment is *never* available on the multicast proxy when the ABR client requests it, because it’s still being broadcast over multicast. And it’s not only the current segment which is never available in time: due to the store-and-forward architecture of the multicast proxy, the second to last segment is not available in time either.

The consequence of those missing segments is that segments will never be served from the multicast source, but will always come from the cDVR origin server, rendering the whole design totally useless. Unless, that is, the ABR client is changed so that it ignores at least the 2 most recent segments. This means that REQ2, “Keep the ABR client devices unchanged”, wouldn’t be achievable, although only a small change would be required. However, there are two alternatives which avoid changes to the client implementation:

  • If you manage the origin server yourself, you can simply create a new manifest endpoint which is always kept 2 segments behind. In this way, the two latest segments in that manifest would always be available on the multicast stream. Although this solution avoids any changes on the client device, it adds significant complexity to the video path. It also doesn’t affect the delay, as it’s kept exactly the same.
  • If you don’t manage the origin server, you can simply hack the manifest outright, by proxying it and applying the equivalent of a 2 segment delay. It isn’t nice, and it only works while the stream is not using SSL, but it works. It’s actually simpler to implement than doing it on the video path, although it’s a hack.
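
As an illustration of the manifest approach, here is a minimal Python sketch which keeps a live playlist 2 segments behind. It assumes an HLS media playlist in which every segment is exactly one `#EXTINF` line followed by one URI line; the function name is hypothetical, and a DASH MPD would need an equivalent XML treatment.

```python
from typing import List


def delay_playlist(playlist: str, segments_behind: int = 2) -> str:
    """Drop the newest `segments_behind` segments from an HLS live media playlist.

    Assumes each segment is exactly one '#EXTINF' line followed by one URI line.
    """
    lines = playlist.strip().splitlines()
    header: List[str] = []         # every tag which is not part of a segment entry
    entries: List[List[str]] = []  # [EXTINF line, URI line] pairs, oldest first
    i = 0
    while i < len(lines):
        if lines[i].startswith("#EXTINF") and i + 1 < len(lines):
            entries.append([lines[i], lines[i + 1]])
            i += 2
        else:
            header.append(lines[i])
            i += 1
    kept = entries[:-segments_behind] if segments_behind > 0 else entries
    body = [line for entry in kept for line in entry]
    return "\n".join(header + body) + "\n"
```

The client then only ever sees segments which have already had time to travel over multicast, without being modified itself.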

What about the trade-offs?

Merging multicast and unicast has been the holy grail of networking research ever since multicast was invented, and the reason why it took so long is that there are trade-offs which, in the case of live video, are very significant. Starting with the original diagram of the architecture, let’s add an estimate of the delay each component brings to the video workflow:

Multicast ABR generic Architecture — Including the delay added by each component

Keep in mind we’re talking about live content, and there are two main types of live content: live sports and most everything else. No one cares if your favourite series’ episode has a 10 or 20 second delay between users using different reception methods (over the air, cable STB or a tablet). Few people will even be bothered if the news starts at 4:00:15 instead of 4:00 sharp, or if you’re still going for the champagne bottle while your neighbours are already popping the cork.

Using Apple’s HLS test streams, the effect of having a 12 second delay gets quite obvious. On the left, what a broadcast user will see on their STB. On the right, what an M-ABR user will see. The broadcast user will always be at least 12 seconds ahead of the M-ABR user.

However, customers will get extremely upset if the neighbours scream “Touchdown!” (in the US) or “Goal!” (elsewhere) while on their screen the ball is still in midfield. This is the most differentiating characteristic between IPTV implementations.

Delay is the elephant in the room where M-ABR is concerned. OTT users have been used to it from day one, as most OTT services already have a significant delay as compared with cable broadcast. If you look at the diagram above, delay starts right after the video leaves the broadcaster, but its scale differs significantly depending on the platform.

Taking as a baseline the Set Top Box, which is used to consume most high value content such as live sports, OTT and M-ABR architectures have increasing delays. For OTT services, delay is caused both by the cDVR component, which by design cannot add less than 1 segment (most implementations impose a 2 segment delay), and by the ABR client implementation, which requires at least 2 segments to be downloaded before starting playback. In total, it means that most OTT implementations have a delay of at least 4 segments, or around 8 seconds, as compared with the STB playing the same content. Again, this is irrelevant for non-live content, such as movies and series, but critical for live sports.

When adding the M-ABR architecture on top of the OTT architecture, the delay increases by at least 2 more segments, totalling 12 seconds. One segment is due to the segment broadcasting step and the other is due to the store-and-forward architecture of the multicast proxy.
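
The delay budget is simple arithmetic, spelled out below as a small Python sketch. The 2 second segment duration is the one used in the example earlier; the dictionary labels are just my own shorthand for the components described in the text.

```python
from typing import Dict

SEGMENT_SECONDS = 2  # segment duration used in the example above

# Delay, in segments, added on top of the STB baseline.
OTT_DELAY_SEGMENTS: Dict[str, int] = {
    "cDVR origin": 2,
    "ABR client start-up buffer": 2,
}
MABR_EXTRA_SEGMENTS: Dict[str, int] = {
    "broadcasting the segment over multicast": 1,
    "store-and-forward on the multicast proxy": 1,
}


def delay_seconds(components: Dict[str, int], segment_seconds: int = SEGMENT_SECONDS) -> int:
    """Total delay contributed by a set of components, in seconds."""
    return sum(components.values()) * segment_seconds


ott_delay = delay_seconds(OTT_DELAY_SEGMENTS)
mabr_delay = ott_delay + delay_seconds(MABR_EXTRA_SEGMENTS)
print(ott_delay, mabr_delay)  # prints "8 12"
```

Since every term scales with the segment duration, longer segments make the gap to the STB proportionally worse.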

In sum, M-ABR works but, contrary to what CableLabs describes, it’s never intended to replace the current cable broadcast. It’s only a method to avoid flooding a cable network with unicast packets intended for mobile devices. If that’s the goal, then it works, and it doesn’t significantly impair the viewer’s experience. At least, not significantly more than it is already impaired.

Where it totally fails

There is a very good reason why M-ABR is defined by CableLabs, and it’s especially related to the “cable” part. On cable, video is not broadcast over multicast, at least not while DOCSIS 3.1 isn’t everywhere. It’s either on plain QAM or using DVB-C. So replacing IP unicast with IP multicast only makes sense, provided there are enough users sharing the same content.

However, if we’re talking about an already existing IP network, usually one using IP multicast, things are different, *very* different.

To start with, if your STB is already fed by IP multicast, which on modern systems already includes trick modes and pauseTV, and is protected by either FEC or retransmission techniques, there’s nothing to gain and plenty to lose, namely added complexity, additional network load and increased delay.

Even for current ABR clients, things aren’t clear. If the goal is to use your home gateway to reduce the load your own ABR clients impose on your network, then you have a far simpler solution: implement a multicast to unicast proxy on your gateway. As opposed to the M-ABR solution, where the home gateway converts broadcast packets into unicast, simply segment the multicast stream into a TS-based unicast stream, either directly using DASH or HLS or, in a more brute-force approach, by simply streaming multicast over HTTP.

The advantages: no delay, no added video path complexity, little to no client changes.

Then, there’s the lack of need for it. On most current IPTV networks, bandwidth isn’t the problem, at least on the last mile. The limiting factor is the split GPON offers, which is mostly limited to 1:64. This means that in the worst-case scenario, each user gets at least 38Mbps of unicast traffic, enough for any streaming video, including 4K HDR.
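
The 38Mbps figure can be checked with a quick calculation. The sketch below assumes the nominal GPON downstream line rate of 2488.32 Mbps shared evenly by all users on the split, and it ignores framing overhead, so treat the result as an approximation; the function name is just for illustration.

```python
GPON_DOWNSTREAM_MBPS = 2488.32  # nominal GPON downstream line rate (ITU-T G.984)


def per_user_mbps(split_ratio: int, line_rate_mbps: float = GPON_DOWNSTREAM_MBPS) -> float:
    """Worst-case share per user when every user on the split is active at once."""
    return line_rate_mbps / split_ratio


print(f"{per_user_mbps(64):.1f} Mbps")  # prints "38.9 Mbps"
```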

Conclusion and final remarks

There is now a movement trying to turn M-ABR into a standard, through either ITU or DVB, as the current CableLabs document only describes a generic architecture. A standard would allow interoperability between vendors offering M-ABR components, which is not the case today. For instance, NORM is fairly well defined, but the packaging method used with it is not, nor is the packaging format. So, even before thinking of boarding this ship, you’ll need to wait for the standardisation process to take place. Then, there’s the issue of need, or lack thereof.

Today, there’s only one working implementation of M-ABR, at Comcast, which is lightly described in this panel with a Comcast representative, and there are good reasons why it’s the only one. It only makes sense on legacy cable networks. If you’re launching a brand new network, either cable or fiber, you’ll be streaming video over multicast from day zero, thus negating any of the benefits this solution would bring.

Originally published at Too many Bits, too little Bytes.