5 Ways of Improving the Video Viewing Experience from a Service Monitoring Perspective

Real-time analytics tools, pre-launch preparation, and enforced operating procedures are key factors in delivering an enjoyable viewing experience

tech@TVB
Oct 22, 2019


by John Au Yeung

Generally speaking, monitoring broadcast TV is very different from monitoring an OTT video service. The primary difference is this:

Broadcast TV monitoring focuses on the transmission path, while OTT monitoring has to cover end-to-end delivery.

End-to-end monitoring covers the video delivery platform (video origin and CDN), various DRM systems, different streaming formats, different ISPs, linear and VOD services, various bitrates and resolutions, different subtitle languages, and so on.

Broadcast TV monitoring is quite sophisticated nowadays, with many monitoring tools available on the market, such as waveform tools for baseband signal level, phase, and AV synchronization. However, because OTT service delivery is complex and viewing expectations differ around the world, OTT monitoring methodology and viewing standards are still at an early stage of development.

At TVB, our vision is to achieve a viewing SLA similar to that of broadcast TV, and we have worked hard towards it. We have set up tools and monitoring systems at every integration point to help us catch a problem quickly before it spreads. They also provide the supporting data needed to identify the root cause and implement an effective remedy.

In the following sections, we look at the tools and methodologies we have introduced.

Monitoring System Evolution from Broadcast to OTT

For a traditional TV delivery platform, the generic monitoring points are video playback, compression, modulators and transmitters, and client endpoints by region.

An OTT monitoring system, by contrast, covers the end-to-end video delivery platform for multiple screens, with various DRM systems, streaming formats, and ISPs, and it has to ensure compliance with STB, iOS, Android, and web players. Monitoring points need to be considered for HLS, MPEG-DASH, and Smooth Streaming, for both live streaming and VOD, each carrying various bitrates, resolutions, and subtitle languages.
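To give a feel for how quickly these monitoring points multiply, here is a minimal sketch that enumerates the combinations of streaming format, service type, and player named above. The dimension lists and the function name are illustrative assumptions, not our actual monitoring configuration, and a real matrix would also include DRM systems, ISPs, bitrates, and subtitle languages.

```python
from itertools import product

# Dimension values taken from the paragraph above; a real deployment would
# add further dimensions such as DRM system, ISP, bitrate and subtitle language.
FORMATS = ["HLS", "MPEG-DASH", "Smooth"]
SERVICES = ["live", "VOD"]
PLAYERS = ["STB", "iOS", "Android", "web"]


def monitoring_points(formats, services, players):
    """Return every (format, service, player) combination that needs a probe."""
    return list(product(formats, services, players))


points = monitoring_points(FORMATS, SERVICES, PLAYERS)
print(len(points))  # 3 formats x 2 services x 4 players = 24
print(points[0])    # ('HLS', 'live', 'STB')
```

Even with only three dimensions there are already 24 combinations to watch, which is why manual monitoring does not scale for OTT.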

Broadcast TV monitoring is typically done with manual monitoring tools only, such as waveform tools that monitor baseband signal level, phase, and AV synchronization.

OTT TV monitoring, on the other hand, involves a variety of performance tools, with customized shell scripts monitoring, for example, platform hardware, network traffic, streaming playlists, databases, and APIs.
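As an illustration of what a streaming playlist check might look like, the sketch below parses an HLS master playlist and reports which expected bitrates are missing. It is written in Python rather than shell for readability. The sample playlist, the expected bitrate ladder, and the function names are all hypothetical, and a production check would fetch the playlist over HTTP and validate far more than this.

```python
import re

# Matches KEY=value pairs, keeping quoted values (which may contain commas) whole.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

# Hypothetical master playlist used only for this example.
SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2"
mid/index.m3u8
"""


def parse_master_playlist(text):
    """Return the variants as dicts with bandwidth, resolution and uri."""
    variants = []
    pending = None  # attributes of the most recent #EXT-X-STREAM-INF tag
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = dict(ATTR_RE.findall(line.split(":", 1)[1]))
            pending = {
                "bandwidth": int(attrs["BANDWIDTH"]),
                "resolution": attrs.get("RESOLUTION"),
            }
        elif line and not line.startswith("#") and pending is not None:
            # The first URI line after the tag belongs to that variant.
            pending["uri"] = line
            variants.append(pending)
            pending = None
    return variants


def missing_bitrates(variants, expected):
    """Return the expected bandwidths that the playlist does not advertise."""
    found = {v["bandwidth"] for v in variants}
    return sorted(set(expected) - found)


variants = parse_master_playlist(SAMPLE)
missing = missing_bitrates(variants, [800000, 2500000, 5000000])
print(missing)  # the 5000000 variant is absent from the sample playlist
```

A check like this can run on a schedule against each channel and raise an alert as soon as a rendition drops out of the playlist.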

As cloud systems become widespread and enable the development of applications and service APIs, managing these components and correlating them in the monitoring system becomes more complex.

A sturdy monitoring system must be built into every integration point. It helps us catch a problem quickly before it spreads, and it provides the supporting data needed to determine the root cause and apply an effective remedy.

End-to-End Delivery Diagram

How to Improve SLA & Customer Experience

I. One of our main missions is to deliver high-quality content in a stable environment, from the headend to the client side. The headend system consists of different components, and the receiving end uses different players and devices to play back the content. To ensure that an audience in the range of millions can enjoy our TV service, we need comprehensive tools that offer reliable observability and traceability. In general, no single monitoring tool on the market today fits an OTT platform 100%. We therefore need an intelligent monitoring system in which the tools are customized and enhanced to observe the hardware, API services, and applications of our distributed IT infrastructure. We also have to write specific scripts to monitor each application, each with a dedicated workflow.

Key Monitoring Components

II. Problem management should be planned for every possible failure case and should involve tuning and enhancing both infrastructure and applications as a preventative measure. With cloud technology, containers can be deployed and managed much more easily during system recovery. However, complicated environments bring new challenges, so implementing the problem management best practices described below is essential in recovery planning.

Workflow of Problem Management

III. Optimize the delivery network by analysing problems with an advanced set of metrics.

For OTT analytics tools that offer an advanced set of metrics, we built a tailor-made solution using multi-dimensional filters, which helps us identify issues effectively using the last-mile information provided by the different ISPs.

Now let us look at some example metrics, shown in the graphs below:

  • Buffer ratio represents the share of viewing time that the player spends buffering instead of playing video.
  • Play failures are registered at player initialization when a stream is not joined.

Buffer Ratio across Different ISPs

Play Failures across Different ISPs

By setting up filters on these metrics and analysing the output data in depth, we can narrow down the problematic conditions that customers are facing and work with the customer’s ISP to improve network quality. Customer experience is very important to us and is our key differentiator, so we have to keep customer satisfaction high by gaining insight into streaming quality and availability.
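The kind of per-ISP filtering described above can be sketched as follows. This is only an illustration: the session records, field names, and thresholds are hypothetical, and buffer ratio is assumed here to be buffering time divided by total session time, which may differ from the definition used by a given analytics tool.

```python
from collections import defaultdict

# Hypothetical session records; field names are illustrative only.
SESSIONS = [
    {"isp": "ISP-A", "play_s": 570.0, "buffer_s": 30.0, "play_failed": False},
    {"isp": "ISP-A", "play_s": 0.0, "buffer_s": 0.0, "play_failed": True},
    {"isp": "ISP-B", "play_s": 990.0, "buffer_s": 10.0, "play_failed": False},
    {"isp": "ISP-B", "play_s": 1000.0, "buffer_s": 0.0, "play_failed": False},
]


def metrics_by_isp(sessions):
    """Aggregate buffer ratio and play failure rate for each ISP."""
    totals = defaultdict(
        lambda: {"play_s": 0.0, "buffer_s": 0.0, "attempts": 0, "failures": 0}
    )
    for s in sessions:
        t = totals[s["isp"]]
        t["play_s"] += s["play_s"]
        t["buffer_s"] += s["buffer_s"]
        t["attempts"] += 1
        t["failures"] += int(s["play_failed"])

    result = {}
    for isp, t in totals.items():
        session_time = t["play_s"] + t["buffer_s"]
        result[isp] = {
            # Assumed definition: buffering time / (playing + buffering time).
            "buffer_ratio": t["buffer_s"] / session_time if session_time else 0.0,
            "play_failure_rate": t["failures"] / t["attempts"],
        }
    return result


def flag_isps(metrics, max_buffer_ratio, max_failure_rate):
    """Return the ISPs whose metrics exceed either threshold, sorted by name."""
    return sorted(
        isp
        for isp, m in metrics.items()
        if m["buffer_ratio"] > max_buffer_ratio
        or m["play_failure_rate"] > max_failure_rate
    )


metrics = metrics_by_isp(SESSIONS)
flagged = flag_isps(metrics, max_buffer_ratio=0.02, max_failure_rate=0.1)
print(flagged)  # only ISP-A exceeds the thresholds in this sample
```

Grouping by ISP is just one filter dimension; the same aggregation could be keyed by device, CDN, or region to narrow a problem down further.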

IV. Besides backend deployment and app development, stress testing is an important step in preparing an app for roll-out. We need to ensure that infrastructure components such as databases, caches, and APIs can cope with the forecast number of users. Based on the test results, we can size the infrastructure to the capacity we actually need and scale up or down as required within our system design. In addition, we can make good use of hardware and cloud resources to enhance deployment flexibility.

Stress Test before Service Roll Out
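As a rough sketch of how stress test results can feed capacity planning, the function below estimates how many instances are needed to serve a forecast peak load while keeping some spare headroom. The figures, the 20% default headroom, and the function name are assumptions for illustration; real sizing also has to account for databases, caches, and uneven traffic.

```python
def instances_needed(forecast_peak_rps, measured_rps_per_instance, headroom_pct=20):
    """Instances required to serve the forecast peak with spare headroom.

    forecast_peak_rps: forecast peak requests per second for the service.
    measured_rps_per_instance: sustainable rate per instance from the stress test.
    headroom_pct: percentage of each instance's capacity kept in reserve.
    """
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in the range [0, 100)")
    if measured_rps_per_instance <= 0:
        raise ValueError("measured_rps_per_instance must be positive")
    # Integer arithmetic avoids floating-point surprises near exact multiples.
    numerator = forecast_peak_rps * 100
    denominator = measured_rps_per_instance * (100 - headroom_pct)
    return -(-numerator // denominator)  # ceiling division


# Hypothetical numbers: 200,000 rps forecast, 5,000 rps per instance measured.
needed = instances_needed(200_000, 5_000, headroom_pct=20)
print(needed)  # 50, since each instance is planned at 4,000 rps
```

Re-running the calculation with updated forecasts or new stress test results shows whether to scale up or down.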

V. Change Management

Our aim is to have better control over both infrastructure and application changes in the production environment. We establish control processes such as rolling out at non-peak hours, fast fallback procedures, and contingency plans in order to minimize unexpected downtime.

Workflow of Change Management
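The control process described above can be sketched in a few lines: a change is only applied inside a non-peak window, and it is rolled back immediately if a health check fails afterwards. The 02:00 to 05:00 window, the function names, and the callback interface are hypothetical and stand in for whatever deployment tooling is actually used.

```python
from datetime import datetime, time


def in_change_window(now, start=time(2, 0), end=time(5, 0)):
    """True if `now` falls in [start, end); supports windows that wrap midnight."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


def apply_change(now, deploy, health_check, rollback):
    """Deploy only in the change window and fall back fast on a failed check."""
    if not in_change_window(now):
        return "deferred"
    deploy()
    if health_check():
        return "deployed"
    rollback()
    return "rolled_back"


# Example run with stub callbacks that simulate a failed health check.
log = []
outcome = apply_change(
    datetime(2019, 10, 22, 3, 0),
    deploy=lambda: log.append("deploy"),
    health_check=lambda: False,
    rollback=lambda: log.append("rollback"),
)
print(outcome, log)  # rolled_back ['deploy', 'rollback']
```

Passing the deploy, health check, and rollback steps as callbacks keeps the control flow independent of any particular deployment tool.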

Conclusion

The growing impact of internet TV means that development strategy is modified every day with new technologies and features. At the same time, business and revenue models are constantly changing in competitive markets. System components have to be able to adapt to these changes. Monitoring systems must keep pace with every changing requirement in order to provide responsive, predictive customer care and to use the right technology to create a better customer experience.

In today’s global marketplace, companies with an OTT platform like ours continue to evolve. We have been growing rapidly in Hong Kong and are becoming increasingly important in overseas markets. We continuously innovate in our OTT services and maintain customer satisfaction. Keeping the SLA at high availability is our ultimate goal; in addition to the implementation efforts described above, we aim to deliver all content with stability, reliability, and good quality. In a coming post, we will tell you more about the observability tools that have empowered our teams.

We have always been forward-focused and have used technology to help improve the customer experience. If you are interested in joining our team, please get in touch with us at tech@tvb.com.
