AIOps for Model Maintenance

Jishma🕸
fiolabs-datascience
6 min read · May 29, 2020

What is AIOps? Let me start by explaining ITOps. ITOps, or IT Operations, refers to all the IT-related operations needed to run a business. It is responsible for delivering and maintaining the services, applications, and technologies the business depends on. However, given the pace of today’s business and a constantly changing technological landscape, traditional ITOps can no longer keep up.

This led to a new model called DevOps. DevOps, as the name suggests, combines software application development (R&D), quality assurance (QA), and ITOps, bridging the gap between development and operations teams by automating the build, test, and deployment of applications. It’s an alternative to the traditional approach of separating development and operational concerns.

What sets it apart from conventional software development are the practices of Continuous Integration (CI) and Continuous Deployment (CD), which accelerate application and service delivery.

With the explosion in the volume and variety of data, as well as the rise of containerized applications and microservices, managing and resolving issues has become a complex task. Although you may have invested in monitoring, event management, and other tooling over the years, keeping up gets harder as your environment grows in complexity. These challenges demand that IT operations teams become not only more efficient but also more agile. Tracking and managing this complexity manually is no longer possible. ITOps has exceeded human scale for years, and the problem continues to get worse.

To overcome this, IT operations needed a new, more agile model. And thus came AIOps.

AIOps

AI and ML have been buzzwords for more than a decade and now power cutting-edge systems in a wide range of areas. So why not incorporate AI into ITOps? This thought gave rise to AIOps, which is basically AI for IT Operations.

The term AIOps was coined in 2016 by Gartner, a research and advisory firm. AIOps refers to the way IT organizations manage data and information in their environments using artificial intelligence (AI). It combines artificial intelligence and human intelligence to provide full visibility and to automate and enhance the performance of the IT systems that businesses rely on. It’s the future of ITOps.

The two main components of AIOps are big data and ML. Put simply, the idea is to collect data from the many monitoring tools that have accumulated over time and apply pattern matching to isolate the root cause of problems.

To identify the root cause of an issue, a system generally needs to have seen that issue before, probably multiple times. This is a constraint for stand-alone AIOps solutions. Unified monitoring vendors, however, already hold all the data and the topology, so it is now possible to feed ML algorithms what they need to pinpoint problems and their root causes precisely. In today’s complex IT environments, this is what we need to find a needle in a haystack.
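To make the role of topology concrete, here is a minimal sketch in Python. It assumes a hypothetical dependency map (`DEPENDS_ON`) and a set of currently alerting services, and it treats an alerting service as a probable root cause when none of its direct dependencies is alerting. The service names and the heuristic are illustrative assumptions, not how any particular vendor does it.

```python
# Hypothetical service topology: each service maps to the services it depends on.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["db", "cache"],
    "db": ["storage"],
    "cache": [],
    "storage": [],
}


def probable_root_causes(alerting, depends_on):
    """Return the alerting services whose direct dependencies are all healthy.

    If a service and one of its dependencies are both alerting, the
    dependency is the more likely origin, so the dependent is filtered out.
    """
    alerting = set(alerting)
    roots = []
    for service in sorted(alerting):
        deps = depends_on.get(service, [])
        if not any(dep in alerting for dep in deps):
            roots.append(service)
    return roots


alerts = ["web", "api", "db", "storage"]
print(probable_root_causes(alerts, DEPENDS_ON))  # ['storage']
```

Four alerts collapse into one candidate here. A real system would combine this kind of topology walk with timing and historical patterns, since a direct-dependency check alone can miss causes that are several hops away from a healthy intermediate service.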

Figure: Visualisation of an AIOps platform by Gartner

An AIOps platform bridges three IT operations disciplines:

  • IT Service Management (“Engage”)
  • Automation (“Act”)
  • Performance Management (“Observe”)

to accomplish the goal of continuous insights and improvements.

According to Gartner, promising AIOps products should:

  • Reduce noise, i.e., false alarms or redundant events
  • Capture anomalies that go beyond static thresholds to proactively detect abnormal conditions
  • Extrapolate future events to prevent potential breakdowns
  • Provide better causality
  • Initiate action to resolve a problem
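As an illustration of the first characteristic, noise reduction, the sketch below collapses repeated alerts into a single incident. It assumes events arrive as `(timestamp_seconds, source, check)` tuples and that repeats of the same source and check within a five-minute window belong together. Both the event shape and the window are assumptions made for this example.

```python
def deduplicate(events, window_seconds=300):
    """Collapse events that repeat the same (source, check) within a time window.

    `events` is an iterable of (timestamp_seconds, source, check) tuples.
    Returns one dict per incident with first/last timestamps and a repeat count.
    """
    incidents = []
    latest = {}  # (source, check) -> most recent incident for that key
    for ts, source, check in sorted(events):
        key = (source, check)
        current = latest.get(key)
        if current is not None and ts - current["last_seen"] <= window_seconds:
            # Same problem still firing: count it instead of raising a new alert.
            current["count"] += 1
            current["last_seen"] = ts
        else:
            current = {
                "source": source,
                "check": check,
                "first_seen": ts,
                "last_seen": ts,
                "count": 1,
            }
            latest[key] = current
            incidents.append(current)
    return incidents


raw = [
    (0, "db-1", "cpu_high"),
    (60, "db-1", "cpu_high"),
    (120, "db-1", "cpu_high"),
    (130, "web-2", "disk_full"),
    (900, "db-1", "cpu_high"),
]
for incident in deduplicate(raw):
    print(incident)
```

Five raw events become three incidents: the first three `cpu_high` alerts on `db-1` merge into one, while the alert at 900 seconds falls outside the window and opens a new incident.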

Some of the companies that provide AIOps solutions are:

  1. BMC Software
  2. Moogsoft
  3. Loom Systems

So now the question is: where does AIOps fit in the modern IT environment?

It’s important to know that AIOps doesn’t replace existing monitoring, log management, or orchestration tools. It works alongside them. You could say it sits at the intersection of those tools and collects data from all of them.

The whole process involves applying machine learning models to analyze the data, create insights, predict potential issues, and possibly suggest ways to fix them.
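One of the simplest ways to "predict potential issues" is to extrapolate a trend. The sketch below fits a straight line to daily disk usage with ordinary least squares and estimates how many days remain before a threshold is crossed. The sample data, the 90% threshold, and the choice of a linear model are assumptions for illustration only. Production tools typically use richer models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until(threshold, xs, ys):
    """Days after the last sample until the fitted trend reaches `threshold`.

    Returns None when the trend is flat or decreasing.
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - xs[-1])


days = [0, 1, 2, 3, 4]
disk_used_pct = [50.0, 52.0, 54.0, 56.0, 58.0]  # hypothetical daily readings
print(days_until(90.0, days, disk_used_pct))  # 16.0
```

Usage grows by two percentage points a day in this sample, so the fitted line reaches 90% sixteen days after the last reading, which leaves time to act before the disk fills up.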

To put such systems into practice, Gartner recommends the following steps for moving from ITOps to AIOps:

  • Deploy AIOps incrementally: start with historical data and progress to streaming data, in step with a continuously improving IT operations maturity.
  • Select platforms that give comprehensive insight into the past and present states of IT systems, that is, platforms capable of ingesting and providing access to both text and metric data.
  • Deepen the IT operations team’s analytical skills by selecting tools that support incremental deployment of the four phases of IT-operations-oriented machine learning (descriptive, diagnostic, proactive capabilities, and root cause analysis) to help avoid high-severity outages.

Elements of AIOps

  • Data Sources: storing and managing data from various data sources like events, monitoring, tickets, logs, etc.
  • Real-Time Processing: software that accesses and preprocesses data from the data sources in real time.
  • Rules and Patterns: software that can detect patterns in the preprocessed data to uncover irregularities and abnormalities.
  • Domain Algorithms: algorithms that allow the software to react automatically to insights gained from the data by detecting variations from normal behaviour and their causes.
  • ML & AI: enhance decision-making ability using machine learning and artificial intelligence.
  • Automation: uses results from ML & AI to automate tasks so as to reduce the workload of IT operators.
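The elements above can be pictured as stages in a pipeline. The following sketch wires three of them together (data sources, real-time processing, and rules feeding automation) and leaves out the ML stage for brevity. The log format, rule names, and actions are all hypothetical and only show how the stages hand data to one another.

```python
import re

# Data sources: hypothetical raw log lines collected from different tools.
RAW_LOGS = [
    "2020-05-29T10:00:01 web-1 INFO request served in 120ms",
    "2020-05-29T10:00:02 db-1 ERROR connection pool exhausted",
    "2020-05-29T10:00:03 db-1 ERROR connection pool exhausted",
    "2020-05-29T10:00:04 web-1 WARN slow response 2300ms",
]

LINE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def preprocess(lines):
    """Real-time processing: parse raw lines into structured records."""
    for line in lines:
        match = LINE.match(line)
        if match:
            yield match.groupdict()


# Rules and patterns: a name plus a predicate over a parsed record.
RULES = [
    ("pool_exhausted", lambda r: r["level"] == "ERROR" and "pool exhausted" in r["message"]),
    ("slow_response", lambda r: r["level"] == "WARN" and "slow response" in r["message"]),
]

# Automation: hypothetical action to take for each finding.
ACTIONS = {
    "pool_exhausted": "restart connection pool",
    "slow_response": "open ticket",
}


def detect(records):
    """Yield (rule_name, host) for every record that matches a rule."""
    for record in records:
        for name, predicate in RULES:
            if predicate(record):
                yield name, record["host"]


def run_pipeline(lines):
    """Run all stages and return the distinct (host, action) pairs to execute."""
    actions = []
    for finding, host in detect(preprocess(lines)):
        action = (host, ACTIONS[finding])
        if action not in actions:  # do not schedule the same action twice
            actions.append(action)
    return actions


print(run_pipeline(RAW_LOGS))
```

In a fuller system, the hand-written `RULES` would be complemented or replaced by learned models, which is where the ML & AI element comes in.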

Use cases

The best way to understand the implementation of AIOps is through use cases.

  1. Anomaly Detection: detecting anomalies by setting a dynamic baseline. Dynamic baselines allow AIOps tools to determine what is normal activity and what is not.
  2. Root Cause Analysis: determining the cause of a problem by tracing it to its source, using event correlation and log analytics, in order to resolve it. It helps reduce Mean Time To Repair (MTTR).
  3. Prediction: detecting trends, predicting future developments, and suggesting configurations that can improve application performance and/or reduce cost.
  4. Noise Reduction and Alarm Management: separating signal from a flood of false alerts and raising intelligent alerts when an anomaly is detected.
  5. Intelligent Remediation: taking automated action to resolve problems.
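The first use case, a dynamic baseline, can be sketched in a few lines. The example below compares each new reading with the mean and standard deviation of the previous few readings and flags it when it deviates by more than `k` standard deviations. The window size, the value of `k`, and the latency samples are assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=5, k=3.0):
    """Flag points more than k standard deviations from a rolling baseline.

    The baseline is the mean of the previous `window` normal points, so it
    adapts as the metric drifts, unlike a fixed threshold.
    """
    history = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(values):
        if len(history) == window:
            baseline = mean(history)
            spread = stdev(history)
            # Guard against a zero spread when the recent history is constant.
            if abs(value - baseline) > k * max(spread, 1e-9):
                anomalies.append((index, value))
                continue  # keep the outlier out of the baseline
        history.append(value)
    return anomalies


latency_ms = [100, 102, 98, 101, 99, 100, 250, 101, 97, 103]
print(detect_anomalies(latency_ms))  # [(6, 250)]
```

A static threshold of, say, 300 ms would have missed the spike to 250 ms, while the rolling baseline catches it because it is far outside the recent range of roughly 97 to 103 ms.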

The Gartner Market Guide for AIOps Platforms includes an example of how an IT organization could move through a staged approach to use case adoption.

References:

  1. Brink, M. (2019, December 2). 2019 Gartner Market Guide for AIOps Platforms. Retrieved from https://www.bmc.com/blogs/gartner-aiops-market-guide/
  2. Iauro. (2019, April 8). An introduction to AIOps. Retrieved from https://medium.com/@iauro/an-introduction-to-aiops-f3e063eaf1eb
  3. Cybulski, Z., & Stempniak, A. (n.d.). What Is AIOps, BizDevOps, CloudOps, DevOps, ITOps, NoOps? A Gentle Introduction to Digital Business Transformation. Retrieved from https://stxnext.com/blog/2019/04/25/aiops-bizdevops-cloudops-devops-itops-noops-introduction-digital-business-transformation/
  4. Paskin, S. (2020, April 17). AIOps in 2020: A Beginner’s Guide. Retrieved from https://www.bmc.com/blogs/what-is-aiops/
