Monitoring live streams at LaLigaSportsTV
Distributing live content for different sporting competitions is a challenge, especially when it comes to ensuring and controlling the quality of the experience, since our service often broadcasts more than 12 events simultaneously.
To do this, in addition to using market-leading tools, we are also developing our own tools for the early detection and location of incidents in our video workflow.
In this article, we will share some ideas that we are implementing to help our operations team in the daily monitoring of activity.
Video streaming workflow
A typical live streaming workflow is composed of (at least) the following stages:
Input: This is the first stage of our workflow, where we receive the input stream of the live event. Usually this stream is not encoded by us; it can be received in many ways (satellite, SDI, IP, OTT …), but in our case the streams arrive over the internet using protocols such as SRT or RTMP.
Encoding: In this stage, we do the transcoding of the input stream into different bitrates to generate an ABR stream so the player can select the right bitrate depending on different bandwidth conditions.
Packaging: In the packaging stage the different video bitrates and audio channels are “packed” into formats such as HLS or MPEG-DASH so they are prepared for being served over HTTP.
Origin: The content is served from our cloud using the standard HTTP protocol.
Delivery: Content is distributed around the world using distributed caches provided by CDNs.
Device: The content is received on the user’s device and is played by the player.
Maintaining the availability of our service is our highest priority. For our operations and quality control team, it is not only important to detect when an incident occurs, but also at what point in the workflow it has occurred in order to make decisions quickly and solve the problem.
To continuously monitor our QoE we use Youbora in our player. Youbora is a third party service that registers (practically) everything that happens in our player. This allows us to see if a problem has occurred, although in some cases it is not possible to see in which stage of our workflow the issue happened.
Our operations team will need to carry out different tasks depending on the location of the issue. For instance, if a problem is detected in the input stage, the team will have to contact the event producer, or perhaps review the configuration of the network.
Manually monitoring so many concurrent live events is expensive and can lead to human error, which is why we are working on automating the monitoring: building tools that constantly analyze live streams, frame by frame, and can notify our operators whenever a problem is detected.
Using PyAV to analyze and monitor live streams
PyAV is an open-source Pythonic binding for FFmpeg’s internal libraries (libavcodec, libavformat…). For those who don’t already know, FFmpeg is one of the most powerful open-source video tools, used by services such as Netflix or YouTube.
Thanks to PyAV we can work with video and audio at a lower level, controlling the input flow of frames. This lets us build multi-threaded applications that connect to different streams and perform stream-integrity analysis, show real-time images of all signals (regardless of the format and protocol in which they are available), and detect errors that may impact the user experience.
PyAV allows easy access to the internal data of the video formats and protocols. It is also possible to process the audio and video frames and perform all kinds of checks.
To install PyAV you just have to type the following command:
pip install av
This command will install PyAV together with the bundled FFmpeg libraries, and the library will be ready to use.
Connecting to a video stream or opening a video file is as simple as:
Once we have opened the stream we can start reading packets, one by one:
It is also possible to open an output video container to send a video somewhere. In the following example, we create an HLS output with a single H.264 video stream:
Then we just need a loop to send the frames of our video to the encoder:
Since PyAV also provides wrappers for FFmpeg’s libavfilter, it is possible to make video compositions. By using the overlay filter, for instance, we have implemented a class that composes video mosaics, which are later served via HLS so they can be played in any standard player.
Here is how our mosaic looks. It’s worth noting that thanks to the power of filter graphs included in FFmpeg, we can draw things like text or VU meters.
In each step of the code, it is possible to detect exceptions caused by errors in the opening or processing of streams, so it is easy to catch the exception and inform the operators via Slack or email.
For all this to work, it is necessary to use threads. Our system runs different threads that work in parallel: a thread for every stream that we read (SourceDecoder), a thread that performs the video encoding of our mosaic (EncodingThread), a thread that processes logs and messages (EventMessenger), and so on.
Everything is connected using Python queues, which are thread-safe, to avoid concurrency problems.
As they are independent threads, the system does not stop if one of the input streams goes down. In that case, its corresponding SourceDecoder simply stops sending frames to its queue, and the system can display information about the failure in the mosaic and warn the operators by sending Slack messages.
Errors that we can detect
Stream loss: PyAV will throw exceptions like av.HTTPNotFoundError or av.EOFError if it finds problems while trying to open or read the stream.
Lack of audio or video tracks: PyAV will expose information about available tracks in the stream.
Poor Bitrate: Calculating video or audio bitrate is straightforward since PyAV exposes information about the duration and size of each audio or video packet.
Frame drops: Detecting frame drops is also easy to implement. All frames include PTS (presentation timestamp) values, so one just needs to read the PTS values of a single track and check that they increase by a constant step: the expected frame duration expressed in the track’s time base.
AV Desync: By reading the PTS of different tracks (audio and video) it is easy to determine if there is a gap between audio and video timestamps within the stream that could potentially cause AV desynchronization in players with small buffers.
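The frame-drop check above boils down to a few lines of pure Python (the example values below assume a 1/90000 time base, where 25 fps gives a step of 3600 ticks):

```python
def pts_gaps(pts_values, frame_duration):
    """Spot frame drops: PTS should advance by exactly one frame duration.

    `pts_values` are the PTS of consecutive frames of one track and
    `frame_duration` is the expected increment in the same time base.
    Returns (index, actual_increment) pairs for every irregular step.
    """
    gaps = []
    for i in range(1, len(pts_values)):
        delta = pts_values[i] - pts_values[i - 1]
        if delta != frame_duration:
            gaps.append((i, delta))
    return gaps


# pts_gaps([0, 3600, 7200, 14400], 3600) flags the jump at index 3
```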
What we have learned so far
This is a work in progress, and during the development of the solution we have faced several problems, especially related to efficiency.
Image manipulation: In some cases, we need to scale images to compose our mosaic. Our first approach was using Pillow, the Python image manipulation library, but this is extremely slow and not suited for video. It is much faster using FFmpeg scale filters.
Buffering and synchronization: Since this is live video, we need to consume frames at a fixed frequency of 25 or 30 frames per second (depending on the input frame rate), which makes buffering and keeping the different sources synchronized critical.
Complexity: If you just need to build a mosaic without the low-level video analysis and your live streams are standard HTTP streams, then HTML5 would be a better choice.
Challenges and improvements
Automatic monitor provisioning: We are working on automatically provisioning monitors with Kubernetes when our live video streams start.
Black screen/silence detection: PyAV provides access to FFmpeg filters that can detect such events, but it’s also possible to implement it in Python by converting video frames to PIL Images or by decoding audio frames to PCM audio.
DRM Support: At the moment, the system is not capable of reading protected DRM streams so, in this case, we can only monitor streams from the INPUT and ENCODING stages.
Detecting artifacts: It would be nice to include functionalities that detect video artifacts by, for instance, adding OpenCV.