How to be data-driven when you aren’t Netflix (or even if you are) — Part 1

Axel Delmas
Lumen Engineering Blog
6 min read · Apr 5, 2018

Data analytics has always played a key role at Streamroot. From day one, our core goal was to improve quality of service for end viewers and to offload as much bandwidth as possible from servers without compromising on QoS. That’s why we set out to implement a comprehensive analytics module built into our solution right from the start, at the prototype phase.

For those of you who are unfamiliar with our solution, Streamroot is a leading provider of innovative OTT delivery technologies for media groups, content publishers and enterprise customers. Our Distributed Network Architecture solution, Streamroot DNA™, prevents congestion and server saturation by offloading significant traffic from CDNs, offering broadcasters improved quality of service, greater audience reach and infinite delivery capacity at a reduced delivery cost.

The crucial role of data analytics

Our solution includes quite a few different algorithms, many of them distributed, and we realized early on that modifications could behave very differently in the wild than in our office environment. Real-life conditions are tricky to reproduce, and the possibilities are endless when millions of devices are running your distributed technology. We realized pretty quickly that relying on intuition alone was like trying to land a rocket on the moon with a walkie-talkie and a Swiss Army knife.

When it comes to improving our technology, we follow a very iterative process. As appealing as the idea of crushing your enemy with a single blow may seem, improving complex software is more like standing toe-to-toe with an army of small issues, advancing in baby steps. You don’t want to compromise all those hard-won improvements by making a single bad decision and ending up back where you were three months before. The only way to truly move forward is to validate each step along the way with comprehensive data analysis. You probably want to do the same when you make decisions and optimizations in your video workflow, like tweaking your ABR logic or stream packaging parameters to improve quality of service.

And quality of service is crucial for most video broadcasters. Whether you’re an SVOD platform seeking to acquire and retain subscribers or an ad-driven streaming service looking to increase viewer engagement, QoS is central to building revenue. A bad viewing experience will make your users leave. And they might never come back.

Building tools that allow you to be more data-driven and make informed decisions about QoS can be intimidating for broadcasters that don’t always have the people or resources to invest, but it will pay off in the long run. And the good news is that you can start small and build your own solution incrementally. You don’t need a huge team of data scientists or a full-fledged AB testing program to start reaping the benefits of analytics.

In this series of articles, I’ll cover our own experience building our in-house data solutions, and hopefully help you move forward with your own data-driven decision-making process.

The start of our quest — set up your data pipeline

The first step in our data odyssey was to build our data pipeline. Here is how it looked when we first started:

Streamroot’s client-side library runs a small module that collects all sorts of statistics, from P2P offload and QoS metrics, down to more granular metrics related to some specific P2P algorithms. It runs some pre-aggregations when applicable, then sends data payloads to our backend.
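To make that a bit more concrete, here is a minimal sketch of what such a client-side stats module could look like. This is an illustrative assumption, not our actual implementation: the class, payload fields and ingest endpoint are placeholders. The idea is simply to pre-aggregate metrics locally and flush a compact payload to the backend at a fixed interval.

```typescript
// Hypothetical sketch of a client-side stats module; names and fields are illustrative.

interface StatsPayload {
  sessionId: string;
  timestamp: number;          // ms since epoch
  p2pOffloadBytes: number;    // bytes served by peers since the last flush
  cdnBytes: number;           // bytes served by the CDN since the last flush
  rebufferingMs: number;      // accumulated rebuffering time since the last flush
}

class StatsCollector {
  private p2pOffloadBytes = 0;
  private cdnBytes = 0;
  private rebufferingMs = 0;

  constructor(
    private sessionId: string,
    private endpoint: string,     // e.g. "https://ingest.example.com/stats" (placeholder)
    flushIntervalMs = 30_000,
  ) {
    // Pre-aggregate locally, then flush on a fixed interval.
    setInterval(() => this.flush(), flushIntervalMs);
  }

  recordChunk(bytes: number, fromPeer: boolean): void {
    if (fromPeer) this.p2pOffloadBytes += bytes;
    else this.cdnBytes += bytes;
  }

  recordRebuffering(durationMs: number): void {
    this.rebufferingMs += durationMs;
  }

  private flush(): void {
    const payload: StatsPayload = {
      sessionId: this.sessionId,
      timestamp: Date.now(),
      p2pOffloadBytes: this.p2pOffloadBytes,
      cdnBytes: this.cdnBytes,
      rebufferingMs: this.rebufferingMs,
    };
    // Reset counters so each payload covers one interval only.
    this.p2pOffloadBytes = this.cdnBytes = this.rebufferingMs = 0;
    // Fire-and-forget: analytics must never degrade playback.
    fetch(this.endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      keepalive: true,
    }).catch(() => { /* drop the sample rather than retry aggressively */ });
  }
}
```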

We wanted to be able to scale as our business grew, so we set up an architecture built around a message broker. The data payloads are received by an HTTP endpoint that forwards them directly to our message broker (Kafka). Kafka is designed for very high write rates, ingesting messages as fast as they arrive so that no data is lost. It is a perfect fit for our use case, since millions of viewers can suddenly arrive at once, for example at the start of a high-profile live event like the FIFA World Cup, which we’ll be powering for some of our tier-one customers this summer.
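As a rough illustration of this ingestion layer, here is a hedged sketch of an HTTP endpoint that does nothing but forward payloads to Kafka. It assumes Node.js with the express and kafkajs packages; the topic name, broker address and port are placeholders rather than our actual setup.

```typescript
// Illustrative ingest endpoint: accept the payload, hand it to Kafka, reply immediately.
import express from "express";
import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "ingest-endpoint", brokers: ["kafka:9092"] });
const producer = kafka.producer();

const app = express();
app.use(express.json());

app.post("/stats", async (req, res) => {
  // All real processing happens downstream; keeping this handler thin is what
  // lets the endpoint absorb sudden traffic spikes.
  await producer.send({
    topic: "raw-stats",
    messages: [{ key: String(req.body.sessionId ?? ""), value: JSON.stringify(req.body) }],
  });
  res.sendStatus(202); // accepted for asynchronous processing
});

async function main() {
  await producer.connect();
  app.listen(8080, () => console.log("ingest endpoint listening on :8080"));
}

main().catch(console.error);
```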

Once the data payloads have been secured in the message queue, they are consumed by different services. We have an array of such “consumers,” which perform tasks such as enriching the payloads with metadata or running aggregations, then write the resulting data points to a database. Since we needed time series for our customer dashboard, we chose InfluxDB as our time series database. We later introduced other technologies better suited to our constant improvement needs, but I’ll come back to that later in the series.
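Below is a similarly hedged sketch of one such consumer: it reads payloads from Kafka, enriches them with a derived offload ratio, and writes the resulting points to InfluxDB. It assumes the kafkajs and @influxdata/influxdb-client packages; the topic, measurement, organization and bucket names are placeholders, not our production schema.

```typescript
// Illustrative consumer: Kafka in, enriched time-series points out to InfluxDB.
import { Kafka } from "kafkajs";
import { InfluxDB, Point } from "@influxdata/influxdb-client";

const kafka = new Kafka({ clientId: "stats-consumer", brokers: ["kafka:9092"] });
const consumer = kafka.consumer({ groupId: "influx-writer" });

const writeApi = new InfluxDB({ url: "http://influxdb:8086", token: process.env.INFLUX_TOKEN! })
  .getWriteApi("example-org", "viewer-stats");

async function main() {
  await consumer.connect();
  await consumer.subscribe({ topic: "raw-stats", fromBeginning: false });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const stats = JSON.parse(message.value!.toString());

      // Enrichment: derive the share of traffic offloaded to the peer network.
      const total = stats.p2pOffloadBytes + stats.cdnBytes;
      const offloadRatio = total > 0 ? stats.p2pOffloadBytes / total : 0;

      // Emit one time-series point per payload.
      const point = new Point("viewer_session")
        .tag("sessionId", stats.sessionId)
        .floatField("offloadRatio", offloadRatio)
        .intField("rebufferingMs", stats.rebufferingMs)
        .timestamp(new Date(stats.timestamp));
      writeApi.writePoint(point);
    },
  });
}

main().catch(console.error);
```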

This was a first step; it provided us with basic insights into the performance of our solution, but it had quite a few limitations.

Why a time series is simply not enough

When we first started with this data pipeline, we only had time series to represent our metrics. Whenever we pushed a new version or a configuration change, the only way we could verify its effect on traffic was to look at the time series before and after the change. This approach has a serious limitation: video data has huge variations due to environmental factors. We might see global changes in our P2P offload, but it is almost impossible to say with any certainty whether these effects are due to the configuration change or the new version we shipped, or to something else entirely.
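To make the limitation explicit, here is a deliberately naive sketch of what this “look before and after” approach boils down to. The types and function are hypothetical; the point is that any environmental shift between the two windows is silently attributed to the change you made.

```typescript
// Naive before/after comparison of a metric around a deployment time.
// Whatever else changed between the two windows ends up in `delta`.

interface DataPoint {
  timestamp: number;  // ms since epoch
  value: number;      // e.g. P2P offload ratio
}

function beforeAfter(series: DataPoint[], deployTime: number, windowMs: number) {
  const mean = (points: DataPoint[]) =>
    points.length ? points.reduce((sum, p) => sum + p.value, 0) / points.length : NaN;

  const before = series.filter(p => p.timestamp >= deployTime - windowMs && p.timestamp < deployTime);
  const after  = series.filter(p => p.timestamp >= deployTime && p.timestamp < deployTime + windowMs);

  return { before: mean(before), after: mean(after), delta: mean(after) - mean(before) };
}
```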

A good example of external factors that can obscure results when testing in a global environment is the state of the network infrastructure. Network conditions are highly variable: a couple of edge servers in your CDN might fail, or there may be an exceptional traffic peak from a geographical region where CDN presence is more limited. All of these will leave their mark on your graphs, but they don’t necessarily mean your solution has gotten any better or worse because of the configuration change you made or the new version you distributed. Some of these variations might be worth noting in order to improve your CDN configuration, but for our purposes they’re mostly environmental noise.

With time series, we had some level of understanding of how our solution was performing, but we could not validate whether the changes we were implementing were moving the needle in the right direction. We had to find a way to remove noise and environmental factors. This is when we started working on our AB testing workflow.

Part 2 of this series of articles will be dedicated to our AB testing process, so stay tuned for my next article here on the Streamroot Tech Blog.

Learn more about our data analytics at NAB

I’ll be talking about our data-driven decision-making process next week at NAB. Drop by to say hi if you’re there:

Wednesday, April 11 at 11:00 am, North Hall Meeting Rooms — N255.

Also, I’ll be happy to see you and catch up at our booth SU9114 during the week. Hope to see you at NAB!

As usual, we are always looking for new tech talent to join our team. Check out open positions on our career page or send your CV directly to contact@streamroot.io.


Co-founder and CTO of Streamroot, a leading provider of peer-accelerated streaming and CDN optimization solutions.