R&D test protocol for peer-assisted video streaming

Sergey Arsenyev
Lumen Engineering Blog
Jun 8, 2023


How we streamlined testing new features for our client-side module

In the world of peer-to-peer (P2P) video streaming, innovation and efficiency are key. At Lumen, we’re passionate about developing products that revolutionise this process. By decentralising media delivery, we’re lessening the burden on content providers’ servers and promoting optimal quality.

A P2P algorithm has a myriad of features that can be fine-tuned to different strategies:

  • the structure of the P2P network, be it mesh or tree, pull or push
  • scheduling strategy and congestion control that decides which peer to ask for a part of the video segment
  • video buffer level management, crucial in live streams, where we achieve P2P exchanges by setting varying buffer levels (thus introducing a buffer spread between viewers)

In this ever-evolving field, there’s no blueprint for which strategy will yield the best performance. Therefore, every idea needs to be tested. Full-scale A/B tests are usually the go-to, but there are pitfalls. They can impact users, which isn’t ideal. But most importantly, any idea needs to be first implemented in production-level code: the feature should be put behind a toggle, and the code should be reviewed, unit-tested, and pass internal QA. It can also become messy if multiple ideas are being considered in parallel: any new feature needs to be compatible with the already implemented ones.

What if we could know if an idea works quickly and discard it if it doesn’t? Enter our proposed solution: a dedicated testbed protocol to simulate real viewers. All it takes is a commit to a special git branch that triggers an experimental build (not a production build) of our P2P client.

This approach allows testing and discarding ideas in hours instead of the weeks otherwise needed for writing production-level code, reviewing it, and conducting an A/B test on real viewers.

Setting up the testbed environment

The real challenge here is ensuring the testbed’s results mirror the impact on actual video viewers as closely as possible. Here’s how we’ve set it up.

We simulate each viewer by opening a self-hosted page featuring a video player with the settings we’re testing. These can include player-specific configurations, the path to the experimental build of the P2P client, and a property name defining the P2P client-specific configuration. We inject the configuration via an HTTP request to our backend.
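
For illustration, here is a minimal sketch of what the script on such a viewer page could look like, assuming a hypothetical query-string format and backend endpoint (these names are not our actual API): the page reads the experiment parameters from its URL, fetches the matching P2P configuration from the backend, loads the experimental build, and starts the player.

```javascript
// Hypothetical viewer-page script: the parameter names, the /api/p2p-config
// endpoint, and initPlayerWithP2P() are illustrative assumptions.
const params = new URLSearchParams(window.location.search);
const p2pBuildUrl = params.get("p2pBuild");   // path to the experimental P2P build
const configName = params.get("configName");  // property name of the P2P configuration

async function startViewer() {
  // Fetch the P2P-client configuration for this experiment from the backend.
  const response = await fetch(`/api/p2p-config?name=${encodeURIComponent(configName)}`);
  const p2pConfig = await response.json();

  // Load the experimental build of the P2P client as a regular script.
  await new Promise((resolve, reject) => {
    const script = document.createElement("script");
    script.src = p2pBuildUrl;
    script.onload = resolve;
    script.onerror = reject;
    document.head.appendChild(script);
  });

  // Hand the configuration to the player integration (placeholder call).
  window.initPlayerWithP2P(document.getElementById("video"), p2pConfig);
}

startViewer();
```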

A testbed can run on either dedicated physical machines or in the cloud (e.g. using Chrome’s headless mode). Currently, we use five on-premises machines, with around 30 video pages in total (the page count should be no lower than the maximum authorised number of peer connections). Having the pages displayed on the screens makes it easy to verify the setup, e.g. the synchronicity between viewers, which is important when testing at reduced stream latency.

Our primary method of data collection involves the use of statistics payloads that our client-side module sends to our data backend. Alternatively, one can store console data in local files for each machine.
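
As a sketch of the second option, the snippet below drives a page with headless Chrome through Puppeteer (a tooling choice assumed here, not necessarily what we use) and appends every console line to a local log file.

```javascript
// Node.js sketch: capture a viewer page's console output to a local file.
// The page URL is illustrative; requires the puppeteer package.
const fs = require("fs");
const puppeteer = require("puppeteer");

const PAGE_URL = "https://testbed.example/viewer.html?configName=configA";
const LOG_FILE = "viewer-console.log";

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Append every console line produced by the page (including the P2P client) to the log.
  page.on("console", (msg) => {
    fs.appendFileSync(LOG_FILE, `${new Date().toISOString()} ${msg.text()}\n`);
  });

  await page.goto(PAGE_URL);
})();
```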

We use a JavaScript script to open and close multiple pages on each machine. When a page closes, we open a new one, maintaining the realism of a continually changing viewer base. Pages open next to each other without overlapping, to avoid the browser’s power-saving optimisations.
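
A minimal sketch of such a controller is shown below, assuming pop-ups are allowed and using an illustrative page URL and grid size: it opens the viewer pages in a non-overlapping grid and reopens a page in the same slot whenever it closes.

```javascript
// Controller sketch: keep COLS x ROWS viewer pages open in a fixed grid.
// The URL and window sizes are illustrative assumptions.
const PAGE_URL = "https://testbed.example/viewer.html";
const COLS = 3, ROWS = 2, WIDTH = 640, HEIGHT = 360;

function openViewer(slot) {
  const left = (slot % COLS) * WIDTH;
  const top = Math.floor(slot / COLS) * HEIGHT;
  const win = window.open(
    PAGE_URL,
    `viewer-${slot}-${Date.now()}`,
    `left=${left},top=${top},width=${WIDTH},height=${HEIGHT}`
  );

  // When the page closes, immediately open a replacement in the same slot,
  // keeping the size of the simulated audience constant.
  const watchdog = setInterval(() => {
    if (win && win.closed) {
      clearInterval(watchdog);
      openViewer(slot);
    }
  }, 1000);
}

for (let slot = 0; slot < COLS * ROWS; slot++) {
  openViewer(slot);
}
```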

When it comes to stream selection for testing, everything hinges on the ideas we want to test.

For example, for testing reduced latency streams, we take the following criteria into account:

  • live streams with a sufficiently high bitrate (6 Mb/s) to challenge the P2P client
  • multiple bitrate tracks, well synchronised with each other
  • short video segments (2 seconds)
  • support for range requests — to allow for segments only partially downloaded through P2P

We also often want to test multiple configurations one by one without human intervention. For this, the script automatically cycles through different page URLs, each representing a unique configuration. We typically run each configuration for a pre-defined period (usually two hours), which generally provides sufficient metric convergence.
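
A sketch of that cycling logic, with illustrative URLs and the two-hour period, could look like this; newly opened viewer pages would simply use whatever URL is current.

```javascript
// Configuration-cycling sketch: each URL encodes one P2P configuration.
// The URLs are illustrative; the run period matches the two hours mentioned above.
const CONFIG_URLS = [
  "https://testbed.example/viewer.html?configName=configA",
  "https://testbed.example/viewer.html?configName=configB",
  "https://testbed.example/viewer.html?configName=configC",
];
const RUN_PERIOD_MS = 2 * 60 * 60 * 1000; // two hours per configuration

let currentIndex = 0;

// Called by the page-management script whenever it (re)opens a viewer page,
// so new pages always use the configuration currently under test.
function currentConfigUrl() {
  return CONFIG_URLS[currentIndex];
}

setInterval(() => {
  currentIndex = (currentIndex + 1) % CONFIG_URLS.length;
  console.log(`Switching new viewer pages to ${currentConfigUrl()}`);
}, RUN_PERIOD_MS);
```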

Simulating realistic viewers

The lifespan of a page should reflect a realistic video session duration. To model a constant probability of churn (a viewer closing the video), we assign each page a random duration drawn from an exponential distribution with a mean of 5 minutes.
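
Concretely, the duration can be drawn with inverse-transform sampling; the snippet below is a sketch where the page closes itself once the sampled time is up (the closing mechanism is an illustrative assumption).

```javascript
// Sample a session duration from an exponential distribution with a
// 5-minute mean, then close the viewer page when the time is up.
const MEAN_SESSION_MS = 5 * 60 * 1000;

function sampleSessionDurationMs() {
  // Inverse CDF of the exponential distribution: -mean * ln(U), U ~ Uniform(0, 1).
  return -MEAN_SESSION_MS * Math.log(1 - Math.random());
}

// window.close() works here because the page was opened by the controller script.
setTimeout(() => window.close(), sampleSessionDurationMs());
```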

In our testbed environment, all machines share the same local network. This, while convenient, could skew the realism of P2P exchanges, making them faster than what you’d experience in real-world scenarios. To counter this, we throttle the upload bandwidth, with each peer randomly assigned to one of three groups (a sketch of the assignment follows the list):

  • group A experiences throttling at 0.5 times the average stream bitrate, and comprises 50% of the peers
  • group B has throttling set at the average stream bitrate and represents 35% of the peers
  • group C (potential “top seeders”), the smallest group at 15% of the peers, experiences throttling at 5 times the average stream bitrate
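
As a sketch, the group assignment and the resulting upload cap could be computed as follows; the 6 Mb/s average bitrate is the illustrative value from the stream criteria above, and how the cap is actually enforced (OS-level traffic shaping, browser tooling, etc.) is left out.

```javascript
// Assign a simulated peer to an upload-throttling group and compute its cap.
// Probabilities and multipliers follow the three groups described above.
const AVG_BITRATE_MBPS = 6; // illustrative average stream bitrate

const GROUPS = [
  { name: "A", probability: 0.5, bitrateMultiplier: 0.5 },
  { name: "B", probability: 0.35, bitrateMultiplier: 1.0 },
  { name: "C", probability: 0.15, bitrateMultiplier: 5.0 }, // potential "top seeders"
];

function assignThrottlingGroup() {
  let u = Math.random();
  for (const group of GROUPS) {
    if (u < group.probability) {
      return group;
    }
    u -= group.probability;
  }
  return GROUPS[GROUPS.length - 1]; // guard against floating-point round-off
}

const group = assignThrottlingGroup();
const uploadCapMbps = group.bitrateMultiplier * AVG_BITRATE_MBPS;
console.log(`Peer in group ${group.name}: upload capped at ${uploadCapMbps} Mb/s`);
```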

Example of results

The image below shows the P2P network formed by the viewers simulated in the testbed. Each dot corresponds to a graph node, and each edge to a P2P connection. Such visualisations are helpful to study different P2P network structures and analyse the impact of a peer leaving the network.

Like in the real-world case, we observe some inequality: some nodes have many connections, while most have just a few. In this example, we have two isolated clusters (connected components), likely due to short-lived sessions that did not have time to connect to the main cluster.

When running testbed experiments, we focus on P2P metrics (offload percentage, the number of connected peers, and the fraction of seeders) as well as video quality of service (QoS) metrics (like buffering ratio and track switch count).
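
For reference, the headline metric can be aggregated from the statistics payloads roughly as in the sketch below, which assumes hypothetical payload field names and defines P2P offload as the share of video bytes delivered through P2P rather than the CDN.

```javascript
// Aggregate P2P offload from a list of statistics payloads.
// The field names (p2pDownloadedBytes, cdnDownloadedBytes) are assumptions.
function p2pOffload(payloads) {
  let p2pBytes = 0;
  let cdnBytes = 0;
  for (const p of payloads) {
    p2pBytes += p.p2pDownloadedBytes;
    cdnBytes += p.cdnDownloadedBytes;
  }
  return (100 * p2pBytes) / (p2pBytes + cdnBytes); // offload percentage
}
```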

The plot below showcases the differences in P2P offload for varying configurations. By comparing three different configurations of the P2P module at different buffer spread values, we can discern which configurations hold the most promise. For instance, the graph reveals that configuration C could be discarded in favour of studying configurations A and B, as the former consistently lies below the latter two (except for the one point at an 8-second buffer spread, where convergence was not fully achieved after around 3000 payload samples).

Conclusions

In this article, we presented an innovative approach to testing new features in the P2P video streaming R&D process. A dedicated testbed protocol simulates real viewers, letting us swiftly assess the viability of new ideas without the drawbacks of full-scale A/B tests. The environment closely mirrors real-world conditions, allowing for accurate and efficient testing.

This article is based on work done together with Ludovic Le Frioux and the Lumen CDN R&D team.

This content is provided for informational purposes only and may require additional research and substantiation by the end user. In addition, the information is provided “as is” without any warranty or condition of any kind, either express or implied. Use of this information is at the end user’s own risk. Lumen does not warrant that the information will meet the end user’s requirements or that the implementation or usage of this information will result in the desired outcome of the end user. All third-party company and product or service names referenced in this article are for identification purposes only and do not imply endorsement or affiliation with Lumen. This document represents Lumen products and offerings as of the date of issue. © 2023 Lumen Technologies
