Published in The Startup

AIOps — The Premise, Promise and the Prediction

Image Courtesy of Pixabay

“We are what we repeatedly do. Excellence, then, is not an act, but a habit.” Aristotle

  • 71% of enterprises believed they lacked a meaningfully scalable model for infrastructure growth.
  • 73% of respondents had identified intelligent automation as a key theme for infrastructure management, as part of their broader IT adoption strategy.

Systems were simple, siloed and segmented

Descent into Chaos … Local Scale to Planetary Scale

The Promise … Single Pane of Glass (SPOG)

Data as a Crystal Ball into the Future State

AIOps Macro Trends — Possible Use Cases

  1. Prevent and Predict: An emerging use case is to predict failures in the DevOps pipeline from signals such as release history, the magnitude of changes, and build complexity. Flagging a risky release early avoids the cost of downtime as well as expensive regression testing.
  2. Anomaly/threat detection: Once the baseline behavior of the system is established, the AIOps tool watches for variance and flags outliers as they appear. This makes AIOps a valuable addition to a strong security management posture: heuristics and algorithms can mine network traffic for threats that could take out a network. If the anomalies turn out to represent a new baseline, the tool can update its thresholds dynamically. This use case is gaining wider traction with the rapid growth of cloud computing workloads.
  3. Event Correlation: Infrastructure teams face floods of alerts, yet only a handful are business impacting. AIOps can mine these alerts, use inference models to group them together, and identify the upstream root-cause issues at the core of the problem. When an event occurs, multiple monitoring systems often generate alert storms while users open related tickets; all of these can be triaged and tracked as one event.
  4. Intelligent alerting and escalation: Once root-cause alerts and issues are identified, ITOps teams use artificial intelligence to automatically notify the relevant subject matter experts or teams for faster remediation. The AI acts as a routing system, setting the remediation workflow in motion before a human being ever gets involved.
  5. Incident auto-remediation: AIOps is also being used as an end-to-end bridge between IT service management (ITSM) and IT operations. Traditionally, ITSM teams sift through infrastructure data to identify and remediate issues at the root cause. AIOps instead extracts root-cause inferences from infrastructure alerts and sends them to an ITSM team or tool through API integrations.
  6. Capacity optimization: This includes predictive capacity planning and refers to the use of statistical analysis or AI-based analytics to optimize application availability and workloads across infrastructure. These analytics can proactively monitor raw utilization, bandwidth, CPU, memory and more, and help increase overall application uptime.
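
To make the anomaly detection use case concrete, here is a minimal Python sketch of a self-adjusting baseline. It is an illustration under stated assumptions, not how any particular AIOps product works: the function name `detect_anomalies`, the rolling window, and the three-sigma threshold are all choices made for this example. Each new reading is compared with the mean and standard deviation of the most recent readings, and because every reading then joins the window, a sustained shift gradually becomes the new baseline.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return the indices of readings that deviate from a rolling baseline.

    The baseline is the mean and standard deviation of the previous
    `window` readings (hypothetical defaults chosen for illustration).
    Every reading, anomalous or not, is added to the window afterwards,
    so a sustained shift eventually becomes the new normal.
    """
    baseline = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(values):
        if len(baseline) == window:
            mu = mean(baseline)
            sigma = stdev(baseline)
            # Guard against a perfectly flat baseline (sigma == 0).
            if abs(value - mu) > threshold * max(sigma, 1e-9):
                anomalies.append(i)
        baseline.append(value)
    return anomalies


# A latency-like metric that jumps from about 10 to about 50 and stays there.
readings = [10, 11, 10, 12, 11, 50, 51, 50, 52, 51, 50, 51]
print(detect_anomalies(readings))  # [5]
```

Only the first reading after the jump is flagged. The later readings around 50 are absorbed into the window and treated as normal, which is the dynamic threshold revision described in the list above.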
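
Event correlation can be sketched in a similar spirit. The example below is a simple rule-based stand-in for the inference models mentioned in the list above: it groups alerts into one incident when they concern the same service and arrive within a few minutes of each other. The `Alert` fields, the `correlate` function, and the 300-second window are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point
    service: str      # hypothetical grouping key
    message: str


def correlate(alerts, window_s=300):
    """Group alerts into incidents by service and time proximity.

    An alert joins the latest incident for its service if it arrives
    within `window_s` seconds of that incident's most recent alert;
    otherwise it opens a new incident.
    """
    incidents = []
    latest_by_service = {}  # service -> most recent incident (list of alerts)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest_by_service.get(alert.service)
        if current and alert.timestamp - current[-1].timestamp <= window_s:
            current.append(alert)
        else:
            current = [alert]
            incidents.append(current)
            latest_by_service[alert.service] = current
    return incidents


alerts = [
    Alert(0, "db", "CPU high"),
    Alert(60, "db", "slow queries"),
    Alert(90, "web", "5xx errors"),
    Alert(1000, "db", "disk full"),
]
for incident in correlate(alerts):
    print(incident[0].service, [a.message for a in incident])
```

Here the four alerts collapse into three incidents, with the first two database alerts tracked as one. A production system would likely correlate on richer signals such as topology or learned patterns, so this should be read only as the shape of the idea.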

AIOps — The Path Forward

“The journey of a thousand miles begins with one step.” Lao Tzu
