Cloud Pak for Data v4.6

Sachin Prasad
Published in Cloud Pak for Data
Dec 1, 2022

The holidays are right around the corner, and most of us are working hard to start the season on a good note and end the year with accomplishments.

I am proud to announce that today we reached another milestone in the Cloud Pak for Data journey: v4.6 is now generally available. This is our second feature release in 2022, only 5 months after the release of 4.5. Delivering high-quality feature releases in such a short span is no small feat, and it was made possible through the dedication, planning, and hard work of the entire team.

Cloud Pak for Data enables organizations to get the most out of their data — encompassing the capabilities of 40+ IBM and partner services that are continually expanding.

As our product has matured, our focus has shifted from only building new features to also doubling down on what makes us unique in the market (enterprise readiness, robustness, and resiliency), making Cloud Pak for Data the best choice for enterprise production workloads. Version 4.6 expands on and solidifies the capabilities of 4.5, and also introduces some new features and services. As a reminder, this highly anticipated release, as well as all subsequent releases, follows the Cloud Pak for Data lifecycle, which we modeled to fulfill a wide range of customer needs.

Below is a summary of what to expect from the Cloud Pak for Data 4.6 platform.

Support for NetApp Storage

Cloud Pak for Data is now certified to support NetApp ONTAP v22.4.0 or later via NetApp Trident CSI drivers. NetApp’s best-of-breed on-premises and cloud-native data management technologies ensure robust datastore availability for Cloud Pak for Data workloads. Snapshot-based online backup and restore on the same cluster are supported in order to provide uninterrupted operation.

Cloud Pak for Data with Spectrum Fusion

Cloud Pak for Data is now fully integrated with IBM Spectrum Fusion and IBM Spectrum Fusion HCI, which enables running data and AI workloads, including online backup and restore to the same or a different cluster. Although Disaster Recovery and Data Protection for Cloud Pak for Data via Spectrum Fusion were introduced in 4.5.3, we made additional improvements in 4.6 to make them more reliable and seamless.

Compliance Updates

In order to keep providing its services to government customers and expand further into federal markets, Cloud Pak for Data has been improved and verified to operate on a CIS (Center for Internet Security) hardened OpenShift.

For the uninitiated, CIS publishes benchmarks: a set of vendor-agnostic, internationally recognized secure configuration guidelines for various platforms such as operating systems, cloud infrastructure, and server software. The CIS Kubernetes and OpenShift hardening guidelines are quickly becoming industry standards for containerized workloads, and Cloud Pak for Data coming out clean on the CIS benchmark is going to be a big deal for our security-savvy customers.

Monitoring Features

In our effort to simplify day 2 operations for Cloud Pak for Data admins, the product introduces new “Alerts” cards on the homepage that link to the “Events and alerts” page within Monitoring, so critical and warning alerts are displayed on the homepage itself. The “Events and alerts” page has also been redesigned to include an interactive bubble graph. These changes are intended to make monitoring more discoverable for administrators.

In addition, the new “Alert Forwarding” tab in the Configurations page allows administrators to easily set up email, SNMP, or Slack notifications to users, with alerts pushed to the admin. Our customers told us that such configurations were hidden and hence missed. Cards on the main page should help alleviate this problem and provide a better user experience for admins.

Finally, admins can now check the status of their users (online/offline) in the Access Control page, as well as the timing and duration of a user’s current or previous session, with an option to evict and disable suspicious users. This update addresses one of the STIG compliance requirements: to identify suspicious access and take action immediately.

Pod Specification Overrides

Cloud Pak for Data now provides a mechanism to override certain aspects of pod definitions, which was previously not possible due to Kubernetes Operator reconciliation. It works by leveraging an admission controller to apply patches directly to pods that are associated with Kubernetes deployments, statefulsets, replicasets, replicationcontrollers, jobs, and cronjobs. This supports use cases such as adding environment variables to pod containers (e.g., for an HTTP proxy), adding labels and annotations to pods, adding node affinity rules to isolate specific workloads, and customizing resource requests and limits.
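To make the idea of admission-time patching concrete, here is a minimal sketch of how a generic Kubernetes mutating admission webhook could build a JSON Patch that adds labels and environment variables to a pod. This is not the Cloud Pak for Data override mechanism or its configuration format; the function names, the proxy URL, and the label key are hypothetical and chosen only for illustration. The response shape follows the standard admission.k8s.io/v1 AdmissionReview contract.

```python
import base64
import json


def build_pod_patch(pod, extra_env, extra_labels):
    """Return RFC 6902 JSON Patch operations for a pod manifest (a dict).

    Hypothetical helper for illustration; assumes the pod has a metadata
    section and at least a spec.containers list, as pods seen at admission do.
    """
    ops = []

    # Labels: create the map if the pod has none, otherwise add key by key.
    labels = pod.get("metadata", {}).get("labels")
    if labels is None:
        ops.append({"op": "add", "path": "/metadata/labels",
                    "value": dict(extra_labels)})
    else:
        for key, value in extra_labels.items():
            # "~" and "/" must be escaped inside a JSON Pointer segment.
            escaped = key.replace("~", "~0").replace("/", "~1")
            ops.append({"op": "add", "path": f"/metadata/labels/{escaped}",
                        "value": value})

    # Environment variables: add them to every container in the pod.
    for i, container in enumerate(pod["spec"]["containers"]):
        new_env = [{"name": k, "value": v} for k, v in extra_env.items()]
        if "env" not in container:
            ops.append({"op": "add", "path": f"/spec/containers/{i}/env",
                        "value": new_env})
        else:
            for entry in new_env:
                # "-" appends to the end of an existing array.
                ops.append({"op": "add", "path": f"/spec/containers/{i}/env/-",
                            "value": entry})
    return ops


def admission_response(uid, ops):
    """Wrap the patch in an admission.k8s.io/v1 AdmissionReview response."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(ops).encode()).decode(),
        },
    }
```

Because the patch is applied when the pod is admitted rather than by editing the owning deployment or statefulset, the operator that reconciles those resources has nothing to revert, which is why this style of override can coexist with operator reconciliation.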

Data Fabric & Connectivity Updates

The core of the data fabric architecture is a data management platform that enables the full breadth of integrated data management capabilities including discovery, governance, curation, and orchestration. Cloud Pak for Data sits at the heart of the IBM Data Fabric strategy with common connectivity being the key pillar to this fabric.

Given this importance, common connectivity features are delivered throughout the delivery cycle of Cloud Pak for Data. This includes new connectors, enhancements to existing connectors, and support for connectors. During the last few months we delivered new connectors for the Kafka streaming platform, the Dremio data lake platform, and SingleStoreDB. We also extended vault and Kerberos support to additional existing connectors.

Deployment Improvements

4.6 introduces improvements for both initial deployments (day 0/1) and day-to-day operations, including single-architecture container images (for chipsets such as x86) for air-gapped deployments. Previously, customers ended up downloading bulky multi-arch ("fat") images even though they were not required in most cases. The new separation will help reduce overall download and install time drastically. Furthermore, we’ve made various cpd-cli enhancements to ease deployment and configuration for CPD administrators.

Elasticity & Scalability

Additional services now support Horizontal Pod Autoscaler (HPA) and Service Shutdown and Restart (SSR), beyond the limited set of core services that were supported in the last release. This is an evolving area, and we are invested in giving our customers the most flexible product possible, one that scales up and down (and even shuts down) based on workloads and budgets.
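For readers new to HPA, the core scaling rule documented for the Kubernetes Horizontal Pod Autoscaler can be sketched in a few lines. This is a simplified illustration, not how any particular Cloud Pak for Data service is configured: the function name and the default replica bounds are assumptions, and real HPA behavior also accounts for pod readiness, missing metrics, and stabilization windows, all of which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA rule: ceil(currentReplicas * currentMetric / targetMetric).

    min_replicas and max_replicas are illustrative defaults; tolerance mirrors
    the Kubernetes default of 0.1, inside which no scaling happens.
    """
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 60% target scale out.
print(desired_replicas(4, 90, 60))
# 4 pods averaging 30% CPU against a 60% target scale in.
print(desired_replicas(4, 30, 60))
```

The same ratio drives scaling in both directions, which is what lets a service grow under load and shrink again when demand, and therefore cost, drops.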

Multi-cloud updates

Cloud Pak for Data strives to support deployments anywhere, on any cloud or on-prem, as long as prerequisites such as RHOCP and storage are met. Our multi-cloud engineering team has been working relentlessly on refreshing the offering on all supported hyperscalers and marketplace listings. They were recently updated to our last stable release, 4.5.3 (4.6 is on the roadmap).

AWS:
Self Managed: BYOL, Cartridges (6 tiles)
ROSA Listings for DF:
Multicloud Data Integration with IBM Data Fabric
Data Governance and privacy with IBM Data Fabric
Data Science and MLOps with IBM Data Fabric

Azure:
Self Managed: BYOL, Cartridges (6 tiles)
ARO: BYOL

IBM Cloud:
Cloud Pak for Data Catalog
Satellite on Prem — Live
Satellite AWS — Live

Base Service and Cartridge Updates

The following new services have been added to the platform:

  • Watson Studio Pipelines is built on Kubeflow Pipelines with a Tekton runtime and is fully integrated into the Watson Studio platform, allowing users to create repeatable and scheduled flows that automate notebook, Data Refinery, and machine learning pipelines.
  • Data Replication allows replication of data changes across heterogeneous data stores without impacting the performance of your systems of record.
  • AI Factsheets can be part of your AI governance strategy, tracking the lifecycles of machine learning models from training to production.

To me, this was a great year, with two solid releases bringing tons of new features across our 40+ services and providing a solid foundational platform to run some serious production workloads. It’s my sincere hope that our clients will appreciate the work and that this new release will help them grow their businesses.

Finally, thanks to all my counterpart product managers across services for making this release a reality. I am super proud of what we accomplished as a team, and we are excited about the value Cloud Pak for Data will bring to our clients.

Happy Holidays!!

*************************************************************

Get Started

Expert Labs Services
Ready to modernize? Need help accelerating Cloud Pak for Data implementation or upgrades? Connect with the Expert Labs team.

Dive Deeper
To learn more about the IBM Cloud Pak for Data 4.6 release:

Publications & Books on Data and AI concepts

Sachin’s day job includes helping customers build smart apps infused with AI to solve complex problems in a more sustainable way.