Similarities and differences between Cloud Pak for AIOps and AIOps SaaS

Ricardo Olivieri
IBM Cloud Pak for AIOps
Sep 8, 2023

Authors: Ricardo Olivieri, Isabell Sippli, Paul Watkins

IBM has released two major milestones in our AIOps Platform portfolio — Cloud Pak for AIOps v4.1 and AIOps Insights (a SaaS offering). Our AIOps Platform solutions both address the following challenges in Ops:

  • Resolving incidents takes too much time because the information relevant to resolution is spread across siloed tools, such as Application Performance Management solutions, Network Performance Monitoring tools, infrastructure monitoring tools, and Security Information and Event Management tools.
  • Such information (e.g., events, environment changes, KPIs, and logs) is not correlated into a single record, and even when it is correlated, there is no indication of what the probable cause could be.
  • Even if the probable cause of an incident is found, it is not clear how to fix it.
  • There is no single pane of glass to investigate status and health of a managed environment, across silos.

Although both CP4AIOps 4.1 and AIOps Insights address these challenges, they differ in important ways. This blog post details their similarities and differences to clarify which of the two solutions an organization should choose as part of its AIOps adoption journey.

Before we describe the differences between CP4AIOps 4.1 and AIOps Insights, we’d like to outline the most relevant similarities. Both are Incident Management solutions and, as such, both address the use cases of data aggregation and noise reduction along with the triaging and remediation of incidents.

In particular, the two solutions focus on the following capabilities:

  • Aggregate and normalize IT operational data from disparate tools
  • Visualize real-time IT environments (i.e., visualization of application and supporting infrastructure topology)
  • Use artificial intelligence to correlate data into incidents (through event correlation and compression)
  • Quickly identify the probable cause of an incident
  • Augment incident remediation by applying AI/ML
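
To make event correlation and compression more concrete, here is a minimal sketch in Python. It is not how CP4AIOps or AIOps Insights implement these capabilities (both apply AI to the problem); it swaps in a simple fixed time-window heuristic purely for illustration. The `Event` fields, the function names, and the 300-second window are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical fields chosen for illustration only.
    resource: str      # affected host or service
    summary: str       # short description of the problem
    timestamp: float   # seconds since some reference point


def compress(events):
    """Collapse repeats of the same (resource, summary) into one alert.

    Returns a list of (first_event, occurrence_count) tuples, ordered by
    the time each alert was first seen.
    """
    alerts = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.resource, event.summary)
        if key in alerts:
            first, count = alerts[key]
            alerts[key] = (first, count + 1)
        else:
            alerts[key] = (event, 1)
    return list(alerts.values())


def correlate(alerts, window_seconds=300.0):
    """Group alerts into incidents using a naive time window.

    An alert joins the current incident if it was first seen within
    window_seconds of that incident's first alert; otherwise it starts
    a new incident.
    """
    incidents = []
    for alert in sorted(alerts, key=lambda a: a[0].timestamp):
        if incidents:
            incident_start = incidents[-1][0][0].timestamp
            if alert[0].timestamp - incident_start <= window_seconds:
                incidents[-1].append(alert)
                continue
        incidents.append([alert])
    return incidents


events = [
    Event("db-1", "disk full", 0.0),
    Event("db-1", "disk full", 10.0),
    Event("app-1", "timeout", 60.0),
    Event("lb-1", "5xx errors", 1000.0),
]
alerts = compress(events)
incidents = correlate(alerts)
```

In this toy data set, four raw events compress into three alerts, and the three alerts group into two incidents. A real AIOps platform would also draw on topology and learned patterns rather than on time proximity alone.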

Let’s now go over the main differences between these two AIOps Platform solutions for Incident Management.

Data residency (data localization)

CP4AIOps is a software product that can be deployed to any Red Hat OpenShift environment that meets the software and hardware requirements. Therefore, you can deploy CP4AIOps to an OpenShift cluster that runs privately within an organization’s on-premises environment or to a remote OpenShift cluster on a cloud provider such as IBM Cloud, Azure, GCP, or AWS. Some organizations have stringent security requirements and, consequently, cannot send their operational data to another cloud provider’s environment or let it leave their own on-premises data centers. For IT organizations that prefer to keep their operational data within the confines of their on-premises environment or their cloud provider of choice, CP4AIOps allows them to do just that.

AIOps Insights is a SaaS offering that runs on AWS and provides off-the-shelf integrations to cloud providers, Application Performance Monitoring tools, and containerization platforms. This implies that operational data from these data sources needs to leave its source environment to be ingested into AIOps Insights. An IT organization that is permitted to send its data outside its own environment can therefore leverage AIOps Insights.

Extensibility and richness

AIOps Insights only ingests data through the off-the-shelf integrations it provides. At the time of writing, AIOps Insights provides off-the-shelf integrations for the following data sources:

Containerization platforms

  • Kubernetes
  • Red Hat OpenShift
  • Docker

Cloud providers

Application Performance Monitoring (APM) tools

  • Instana (in addition to infrastructure topology, it also gathers application topology)
  • AppDynamics
  • Dynatrace
  • Datadog
  • Zabbix
  • IBM APM
  • Splunk
  • New Relic

Network Performance Monitoring tools

  • SevOne

Wherever possible, AIOps Insights will ingest topology, metrics, and events together, and associate the individual data points with each other automatically, with no further configuration.

For details on the many out-of-the-box (OOTB) integrations available in CP4AIOps, please see “Defining connections and integrations” in the CP4AIOps documentation. There you will find a long list of integrations for ingesting IT operational data from many data sources. CP4AIOps also supports ingesting data from all Netcool probes and ingesting event data through webhooks. If a dedicated OOTB connector does not exist for your event data source, you can define a webhook integration that consumes JSON event payloads.
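
As a rough illustration of what sending a JSON event payload to a webhook can look like, here is a minimal sketch that uses only the Python standard library. The URL and every field name in the payload are hypothetical placeholders, not the CP4AIOps event schema; the real endpoint, authentication, and field mapping come from the webhook integration you define in the product.

```python
import json
import urllib.request


def build_event_request(webhook_url, event):
    """Build (but do not send) an HTTP POST carrying a JSON event payload."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and payload fields, for illustration only.
WEBHOOK_URL = "https://aiops.example.com/webhook/my-integration"
event = {
    "resource": "db-1",
    "summary": "Disk usage above 95%",
    "severity": "critical",
    "timestamp": "2023-09-08T12:00:00Z",
}

request = build_event_request(WEBHOOK_URL, event)

# To actually send the event, uncomment the lines below once the URL
# points at a real webhook and any required credentials are in place:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping the request construction separate from the send step makes it easy to inspect or test the payload before any data leaves the source environment.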

Using OOTB connectors, CP4AIOps can:

  • Ingest application and system logs to identify anomalies. When anomalies are identified, these are raised as internal alerts that are correlated with the alerts coming from external data sources.
  • Ingest change request records (from ServiceNow) to perform change risk assessment, identifying change requests that are likely to be risky and could end up causing outages or problems in the IT environment.
  • Ingest historical incident records (from ServiceNow) to correlate previous incidents and known solutions to new incidents. These added insights empower IT operations staff to resolve problems quicker, significantly reducing the time to diagnose and remediate.
  • Notify IT operations staff when a new incident is created by sending a notification to a ChatOps tool (Slack or Microsoft Teams). The incident notification in the ChatOps user interface contains meaningful and descriptive information about the problem as well as relevant links to the corresponding views on the CP4AIOps console (as a side note, AIOps Insights supports sending only individual alerts to Slack). These notifications point IT operations teams straight to the cause of an incident, accelerating the process of identifying and resolving problems.

CP4AIOps also allows for the implementation of custom connectors in addition to its many OOTB connectors (e.g., Instana, Netcool, and Dynatrace). The capability to implement custom integrations opens the door for ingesting data from the observability and monitoring tools of your choice into CP4AIOps.

Setup and configuration

As mentioned previously, CP4AIOps can be installed anywhere that Red Hat OpenShift runs, such as on a cloud provider’s infrastructure or on-premises. Hosting and monitoring an OpenShift cluster has its own set of requirements, which some IT organizations do not want to take on or lack the skills to meet. For these organizations, a SaaS offering is more attractive because they can focus on leveraging the AI and ML capabilities for IT operations without having to worry about, say, applying security patches, procuring more hardware for storage, or monitoring the health of the systems and infrastructure where the AIOps tool is deployed.

To define integrations with AIOps Insights, users simply download code from the AIOps Insights console and deploy it to the target data source environments. This approach simplifies the configuration effort since there are no forms to fill out with URLs, usernames, passwords, and custom configuration properties. Simplicity is pervasive in the way that integrations are created for AIOps Insights.

To define integrations with CP4AIOps, users need to provide connectivity parameters (e.g., hostnames, URLs, IP addresses, and authentication credentials), which makes the process a bit more involved. In return, CP4AIOps provides more granular control over the data to be ingested, giving users more flexibility.

Data types

AIOps Insights consumes events, metrics, and topology data. By ingesting and augmenting this data with AI/ML, AIOps Insights can correlate alerts, reduce the noise that distracts IT operations staff, and identify the probable cause, among other capabilities.

In addition to events, metrics, and topology data, CP4AIOps collects:

  • Application and system logs (for log anomaly detection).
  • Historical change request records to determine the potential risk of implementing new change requests in an IT environment.
  • Historical incident records in order to associate these with new incidents that are similar.

By tapping into these additional types of data and applying AI/ML algorithms to them, CP4AIOps can correlate the resulting insights with events, metrics, and topology.

Conclusion

We started this write-up by describing the similarities between CP4AIOps and AIOps Insights, given that both are Event and Incident Management solutions. We then highlighted the main differences between the two. These differences can be used to decide which offering is the best fit for an IT organization. For instance, if an IT organization uses any of the operational tools for which AIOps Insights provides OOTB integrations, then configuring those integrations and getting results takes much less effort with AIOps Insights. On the other hand, if an IT organization needs to ingest data from data sources that AIOps Insights does not support, needs more granular control over the operational data, or needs to keep all operational data within the confines of an on-premises data center, then CP4AIOps is a better fit.
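
The decision guidance above can be condensed into a small sketch. This is only an illustration of the reasoning in this post, not an official sizing or selection tool; the `Requirements` fields and the `suggest_offering` function are names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    # Hypothetical flags that mirror the criteria discussed in this post.
    data_must_stay_in_own_environment: bool
    needs_unsupported_data_sources: bool
    needs_granular_ingestion_control: bool


def suggest_offering(req):
    """Return the offering this post's guidance points to.

    Any one of the three requirements is enough to favor CP4AIOps;
    otherwise the lower setup effort favors AIOps Insights.
    """
    if (
        req.data_must_stay_in_own_environment
        or req.needs_unsupported_data_sources
        or req.needs_granular_ingestion_control
    ):
        return "CP4AIOps"
    return "AIOps Insights"


on_prem_only = Requirements(True, False, False)
supported_tools_only = Requirements(False, False, False)
```

Here, `suggest_offering(on_prem_only)` yields CP4AIOps, while `suggest_offering(supported_tools_only)` yields AIOps Insights. Real decisions will weigh further factors, such as available skills and the willingness to operate an OpenShift cluster.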

Acknowledgements

We’d like to thank Jacob Yackenovich and Jeremy Hughes for their invaluable review and feedback on this article.
