From a Reactive to a Proactive Model, Service Assurance Is Transforming

Matthew Twomey
Anritsu Service Assurance
Aug 28, 2024

This is from a piece published in The Fast Mode on 2 August 2024.

(https://www.thefastmode.com/expert-opinion/36610-from-a-reactive-to-a-proactive-model-service-assurance-is-transforming)

During the Jurassic period, orb-web spiders began constructing 2D webs as a means to capture prey. Rather than proactively hunting for insects, the spider waits for one to become tangled in its web. The insect’s struggle sends vibrations through the web, alerting the spider that prey has been captured and revealing where in the web it is. At this point, the spider makes its way to the location and consumes the insect.

Much like the orb-web spider’s way of hunting, the early days of service assurance in telecommunications were characterised by a reactive approach to issue resolution. Only after an alarm was received was the data that had been deposited in centralised assurance analysed. When necessary, a deeper dive followed, investigating data on the edge (on probes), and only then was the problem resolved. While this two-step method works well for spiders, in telecoms the wait-and-see, multi-step approach, combined with the time-consuming centralisation of network and subscriber traffic data, lengthens the mean time to resolution (MTTR). In addition to delays in resolving problems, operators are increasingly grappling with other considerations.

These include insufficient data storage capacity, or the budget for it, to hold the vast amounts of data needed for troubleshooting and analysis; today’s complex data protection requirements; the exponential growth of telecom data; and an increase in data speeds that has resulted in a surge in User Plane traffic. Operators are in dire need of a solution to these growing challenges. A new approach that leverages artificial intelligence for IT operations (AIOps) has shown considerable promise in addressing them. However, it requires a radical change in thinking, a retooling of current practices, a commitment to innovation, and a willingness to embrace new technologies and methodologies. This future of service assurance is best described as Octopus Assurance.

What is Octopus Assurance?

Octopuses have both a central brain and a network of nerves within and between the arms, which serves as a second brain (brachial plexus). While the octopus’s arms operate independently, they also function in collaboration with the central brain. For instance, when an octopus encounters a potential food source, it doesn’t take the time to communicate with the central brain. Instead, its brachial plexus makes the decision independently — on the edge.

Like an octopus making decisions on the edge, telecom edge assurance requires service assurance or automation systems that can automate decision-making closer to the root of the problem. However, the power to make edge decisions can only be fuelled by machine learning (ML) and artificial intelligence (AI). Complex pattern recognition and a new form of communication between the edge and central assurance are required to allow for faster issue detection, root cause analysis, and automated resolution. Edge assurance also requires agents (edge agents) located away from the core and closer to network boundaries.

The inner workings of edge agents

Leveraging near real-time data, along with rules or references to validate their findings, edge agents proactively scan for anomalies and issues on the network’s edge. The issues they can help with range from simple to complex. A falling call success ratio (CSR) is an example of a simple problem: an edge agent can quickly trace it to where calls are failing in the radio access network (RAN). When the CSR in the access network falls below a predefined threshold (a rule), the edge agent can trigger an alarm or independently attempt an automated resolution.
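As a rough illustration of the simple case, the sketch below applies a CSR threshold rule to per-cell counters. The `CellCounters` type, the cell names, and the 0.98 threshold are all assumptions made for the example; they do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class CellCounters:
    """Hypothetical per-cell call counters collected by an edge agent."""
    cell_id: str
    attempts: int
    successes: int


def call_success_ratio(c: CellCounters) -> float:
    """CSR = successful calls / attempted calls (1.0 when there is no traffic)."""
    return c.successes / c.attempts if c.attempts else 1.0


def cells_below_threshold(cells, threshold=0.98):
    """Return the IDs of cells whose CSR is below the rule's threshold."""
    return [c.cell_id for c in cells if call_success_ratio(c) < threshold]


# Illustrative counters only; the threshold value is an assumption.
cells = [
    CellCounters("cell-A", attempts=1000, successes=995),
    CellCounters("cell-B", attempts=800, successes=720),
    CellCounters("cell-C", attempts=0, successes=0),
]
print(cells_below_threshold(cells))  # ['cell-B']
```

In practice, the list returned by the rule would feed whatever the agent does next: raising an alarm or attempting an automated fix.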

In the case of a more complex issue, imagine that an edge agent is tracking anomalous behaviour that simple rules can’t resolve. In this scenario, the edge agent can use fingerprinting to cross-reference what it has recorded about the behaviour against a library of previous anomalous incidents. If it finds a match, it can access the required automated resolution data (evidential data) and resolve the issue faster than if it waited for the aggregation, mediation, and correlation process of centralised service assurance.
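How fingerprints are built and compared is not specified here, so the following is only one possible sketch. It assumes a fingerprint is a set of symptom tags and uses Jaccard similarity with an arbitrary cut-off of 0.6; the incident IDs and tags are invented for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(observed: set, library: dict, min_similarity: float = 0.6):
    """Return (incident_id, score) of the closest known incident, or None."""
    best_id, best_score = None, 0.0
    for incident_id, fingerprint in library.items():
        score = jaccard(observed, fingerprint)
        if score > best_score:
            best_id, best_score = incident_id, score
    return (best_id, best_score) if best_score >= min_similarity else None


# Hypothetical library of previously fingerprinted incidents.
library = {
    "INC-001": {"sip_503", "high_latency", "single_cell"},
    "INC-002": {"dns_timeout", "multi_cell", "packet_loss"},
}
observed = {"sip_503", "high_latency", "single_cell", "evening_peak"}
print(best_match(observed, library))  # ('INC-001', 0.75)
```

When no stored incident is similar enough, `best_match` returns `None`, which is the point at which an agent would fall back to centralised analysis.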

Upon issue resolution, the edge agent shares its findings with other edge agents and crowdsources from them. Sharing this evidential data ensures all agents maintain the same data and, thus, the same level of intelligence.
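One way to picture this sharing is a push-and-pull exchange between linked agents. The sketch below is an assumption about how that could look, not a description of an existing protocol; the `EdgeAgent` class and its method names are hypothetical.

```python
class EdgeAgent:
    """Hypothetical edge agent holding a library of evidential data."""

    def __init__(self, name):
        self.name = name
        self.library = {}  # incident ID -> evidence (any description of the fix)
        self.peers = []

    def connect(self, other):
        """Link two agents so each can share with and pull from the other."""
        self.peers.append(other)
        other.peers.append(self)

    def share(self, incident_id, evidence):
        """Record a resolved incident locally and push it to every peer."""
        self.library[incident_id] = evidence
        for peer in self.peers:
            peer.library.setdefault(incident_id, evidence)

    def crowdsource(self):
        """Pull any evidence that peers hold and this agent lacks."""
        for peer in self.peers:
            for incident_id, evidence in peer.library.items():
                self.library.setdefault(incident_id, evidence)


a, b, c = EdgeAgent("a"), EdgeAgent("b"), EdgeAgent("c")
a.connect(b)
b.connect(c)                  # a and c are not directly linked
a.share("INC-001", "reboot")  # pushed to b only
c.crowdsource()               # c picks the record up from b
print(a.library == b.library == c.library)  # True
```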

Edge assurance relies on evidential data

As operators embrace differing levels of network autonomy, their resolution, automation, and orchestration systems will need evidence that a fix is possible or has been possible in the past.

Evidence falls into two categories:

  1. Fixes and automations for generic issues.
  2. Operator-specific fixes and automations.

The evidence itself is incident-specific but should at least include:

  1. A fingerprint (description) of the incident that can be cross-referenced against an ongoing issue.
  2. A template for the fix, such as reboot, redirect, configuration update, etc.
  3. A reference that allows the action system to understand if the fix has resolved the issue.
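To make the three elements concrete, an evidence record could be modelled along the following lines. This is a minimal sketch: the field names, the use of a metric and threshold as the "reference", and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """The two evidence categories: generic or operator-specific."""
    GENERIC = "generic"
    OPERATOR_SPECIFIC = "operator-specific"


@dataclass(frozen=True)
class Evidence:
    scope: Scope
    fingerprint: frozenset  # description of the incident, here a set of tags
    fix_template: str       # e.g. "reboot", "redirect", "configuration update"
    check_metric: str       # metric the action system re-measures after the fix
    check_threshold: float  # value the metric must reach for the fix to count

    def fix_resolved(self, measured: float) -> bool:
        """True if the re-measured metric meets the reference threshold."""
        return measured >= self.check_threshold


# Hypothetical record for a CSR drop confined to one cell.
record = Evidence(
    scope=Scope.GENERIC,
    fingerprint=frozenset({"csr_drop", "single_cell"}),
    fix_template="reboot",
    check_metric="call_success_ratio",
    check_threshold=0.98,
)
print(record.fix_resolved(0.995), record.fix_resolved(0.90))  # True False
```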

The future of service assurance lies in edge agents

The future of service assurance lies not in centralising data but in empowering edge assurance agents. Technologies (AI, ML algorithms, real-time analytics, and edge computing), combined with innovative processes, will allow issue detection, root cause analysis, and automated issue resolution to be handled faster and more efficiently than ever. These innovations give operators the power to ensure robust service assurance and thrive amidst the data deluge while enhancing the customer experience.

Matthew Twomey
Anritsu Service Assurance

Working in Telecoms for 25 years. Doing product marketing, marketing & sales enablement. Working in Service Assurance space for 20 years. Change is coming!