Transform your business with Cloud Pak for Data v3.5

Published in Cloud Pak for Data, Oct 19, 2020 (8 min read)

2020 has been a transformational year of epic proportions. Enterprises are either thriving with exponential growth or in survival mode, adapting to the new normal. We at IBM firmly believe that to survive, adapt and succeed, enterprises ought to innovate, accelerate transformation and infuse automation into their business processes. The latest version of Cloud Pak for Data, version 3.5, is packed with capabilities to drive business transformation, whether through cost savings from automation and tool consolidation or through innovation at scale.

While Cloud Pak for Data is an end-to-end Data & AI platform that is tightly integrated and can run on any public or private cloud, it is also modular and composable, allowing enterprises to adopt just the capabilities that they need. The fact that it includes a comprehensive set of IBM, open source and third-party services makes it truly an “open platform”.

The enhancements for 3.5 can be broadly grouped into two key themes: (1) cost reduction and (2) innovation to drive digital transformation. Customers can drive down costs through automation, consolidated management and an integrated platform, all of which have received significant enhancements in version 3.5. On the innovation front, accelerated AI, improved governance and security, and an expanded ecosystem have been the key focus areas. We’ll cover each of these themes in detail below:

1a. Enhanced Automation

The Cloud Pak for Data Operator simplifies installation and lifecycle management, including automated provisioning, configuration management, seamless patching, upgrades and scaling, with operational knowledge baked in, along with auto-pilot management.
Automated data discovery & Quick Scan improvements: The Quick Scan capability is upgraded to perform more scalable data discovery with richer analysis results that can be published to one or more catalogs directly from the results page. Also, the sample size for auto discovery is set to 1,000 records by default to optimize performance; this can be changed if needed.
Enhancements to governance workflows: You can now use workflows to manage and automate your business processes in Cloud Pak for Data. For example, when you install Watson™ Knowledge Catalog, the service includes predefined workflow templates that you can use to control the process of creating, updating, and deleting governance artifacts. From the Workflow management page, you can define and configure the types of workflows that you need to support your business processes. You can also import and configure BPMN files from Flowable.

1b. Simplified Administration

Platform management with advanced resource quotas to manage and monitor your Cloud Pak for Data deployment: The Platform management page gives you a quick overview of the services, service instances, environments, and pods running in your Cloud Pak for Data deployment. The page also shows you whether there are any unhealthy or pending pods. If you see an issue, you can use the cards on this page to drill down and get more information about the problem. In addition, you can see your current vCPU and memory use. You can optionally set quotas to help you track your actual use against your target use, and enforce these quotas to prevent Cloud Pak for Data or individual services from using too many cluster resources.
Manage production workloads with deployment spaces: The Deployment spaces page gives you a dashboard from which you can monitor and manage production workloads in multiple deployment spaces. This page makes it easier for operations engineers to manage jobs and online deployments, regardless of where they are running. The dashboard helps you assess the status of workloads, identify issues, and manage workloads. You can use this page to compare jobs, identify issues as they surface, and accelerate problem resolution.
New CLI commands to automate and manage service instances, back up and restore the project where Cloud Pak for Data is deployed, and import and export platform metadata.
New APIs to enable programmatic access to a number of platform assets, including but not limited to virtualized data sets from the Data Virtualization (DV) service.
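
The difference between tracking use against a quota and enforcing that quota can be pictured with a small sketch. This is not Cloud Pak for Data code or its API: the `Quota` class, its field names, and the `check_request` helper are hypothetical, chosen only to illustrate the two modes described in the platform management item above.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    """Hypothetical resource quota for the platform or for a single service."""
    vcpu_limit: float         # target vCPU use
    memory_gib_limit: float   # target memory use, in GiB
    enforce: bool = False     # False: only report; True: reject requests over the limit


def check_request(quota, used_vcpu, used_mem_gib, req_vcpu, req_mem_gib):
    """Return (allowed, message) for a new workload request."""
    over_cpu = used_vcpu + req_vcpu > quota.vcpu_limit
    over_mem = used_mem_gib + req_mem_gib > quota.memory_gib_limit
    if not (over_cpu or over_mem):
        return True, "within quota"
    if quota.enforce:
        # Enforced quota: the request is refused to protect cluster resources.
        return False, "rejected: quota would be exceeded"
    # Monitored quota: the request proceeds, but the overage is reported.
    return True, "allowed, but over target use"
```

With a quota of 16 vCPU and 64 GiB, a request that would push use to 20 vCPU is only flagged when `enforce` is off, and refused when it is on.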

1c. Improved User Experience

Guided walk-me flows and product tours to accelerate user onboarding and drive self-service usage.
Navigation improvements: The Cloud Pak for Data navigation menu has been updated to make it easier to find the features that you need. The new navigation focuses on the objects that you need to access, such as projects, catalogs, data, services, and your task inbox.
Personalization: There are now two ways to customize the home page. With platform-level customization, an administrator can specify which cards and links are displayed on the home page; the changes apply to all users, as long as they have the relevant access permissions. You can also customize the resource links that are displayed in the Resources section of the home page. In addition to platform-level customization, each user can further customize the cards and Quick navigation links that are displayed on their own home page.
New “Platform connections” capability makes it easier for administrators to define and manage connections and for users to find connections. The Connections page is a catalog of connections that can be used by various services across the platform. Any user who has access to the platform can see the connections on this page. However, only users with the credentials for the underlying data source can use a connection. Users who have the Admin role on the connections catalog can create and manage these connections. Unlike in previous releases of Cloud Pak for Data, services can refer to these connections rather than creating local copies. This means that any changes you make on the Connections page are automatically cascaded to the services that use the connection.
New “Data Management Console” helps you administer, monitor, manage, and optimize your databases from a single web-based console. It is a browser-based tool that lets you manage and monitor all of your IBM Db2 databases, including Db2 Big SQL and Data Virtualization, from a single user interface. The console helps you improve your productivity by simplifying the management and maintenance of a complex database ecosystem. The console home page provides an overview of all of the databases that you are monitoring, including database connection status and monitoring metrics that you can use to analyze and improve the performance of your databases.

2a. Accelerated AI

Watson Machine Learning Accelerator (WML-A) is a new deep learning service for data scientists for optimizing model training and monitoring deep learning workloads. It enables you to iterate through the training cycle on more data to continuously improve the model over time. WML-A provides many optimizations that accelerate performance, improve resource utilization, and reduce the complexity of installation, configuration, and GPU management.
Federated learning (Tech Preview) to train a common model using remote, secure data sets. The data sets are not shared so full data security is maintained, while the resulting model gets the benefit of the expanded training.
“Auto AI” now supports multiple data sources as input for training experiments. Data scientists can use the data join canvas to combine data sets based on common columns, or keys, to build a unified data set that can then be used as input for Auto AI experiments.
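
To make the idea of combining data sets on a common key concrete, here is a minimal plain-Python sketch of an inner join. It is only an illustration of the concept: the join canvas is a graphical tool, and this article does not describe how it works internally. The `inner_join` helper, the column names, and the sample rows are all hypothetical.

```python
def inner_join(left, right, key):
    """Inner-join two lists of dict rows on a shared key column."""
    # Index the right-hand rows by key value so each lookup is a dict access.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        # Rows without a match on the other side are dropped (inner join).
        for rrow in index.get(lrow[key], []):
            joined.append({**lrow, **rrow})
    return joined


# Hypothetical data sets that share the "customer_id" column.
customers = [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": "APAC"},
]
orders = [
    {"customer_id": 1, "amount": 250},
    {"customer_id": 1, "amount": 90},
    {"customer_id": 3, "amount": 40},
]

# The unified data set that would serve as input for a training experiment.
training_rows = inner_join(customers, orders, "customer_id")
```

Here only customer 1 appears in both data sets, so the unified data set contains that customer's two orders, each carrying the region column from the first data set.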

2b. Improved governance & security

Enhanced Data Privacy capabilities:
User groups for more granular access to assets: A Cloud Pak for Data administrator can create user groups to make it easier to manage large numbers of users who need similar permissions. When you create a user group, you specify the roles that all of the members of the group have. You can also configure a connection to an LDAP server, so that user groups can include existing platform users, LDAP users and LDAP groups. You can assign a user group access to various assets on the platform in the same way that you assign access to an individual user. The benefit of a group is that it is easier to (i) give a large number of users access to an asset and (ii) remove a user’s access to assets by removing them from the user group.
New security vault to store platform secrets: Cloud Pak for Data introduces a new set of APIs that you can use to protect access to sensitive data. You can create a vault in which to store tokens, database credentials, API keys, passwords and certificates.
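
The benefit of user groups can be seen in a short sketch: a user's effective roles are the ones assigned directly plus the ones inherited from group membership, so removing the user from the group removes the inherited access in one step. This is a conceptual illustration, not the platform's API. The `effective_roles` helper, the data layout, and the user, group and role names are hypothetical.

```python
def effective_roles(user, direct_roles, groups):
    """Roles a user holds directly plus roles inherited from group membership."""
    roles = set(direct_roles.get(user, ()))
    for group in groups.values():
        if user in group["members"]:
            roles |= set(group["roles"])
    return roles


# Hypothetical users, group and role names.
direct_roles = {"alice": {"User"}, "bob": {"User"}}
groups = {
    "data-stewards": {"roles": {"Data Steward"}, "members": {"alice", "bob"}},
}

before = effective_roles("bob", direct_roles, groups)

# Removing bob from the group revokes everything he inherited from it.
groups["data-stewards"]["members"].discard("bob")
after = effective_roles("bob", direct_roles, groups)
```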

2c. Expanded ecosystem

3 new IBM cartridges: OpenPages, Open Data for Industries, and Knowledge Accelerators.
31+ industry accelerators and 50+ ISV services make Cloud Pak for Data an open platform. 9 new ISVs are added as part of version 3.5; notable among these are EnterpriseDB Postgres, Hazelcast and Trilio.
Also, Cloud Pak for Data is now available on the IBM and Tech Data marketplaces and supports Linux on IBM Z.

Other Notable enhancements

Connect to storage volumes: In Cloud Pak for Data Version 3.5.0, you can connect to storage volumes from the Connections page or from services that support storage volume connections. The storage volumes can be on external NFS storage or persistent volume claims (PVCs). This feature enables you to access the files that are stored in these volumes from Jupyter Notebooks, Spark jobs, projects, and so on. You can also create and manage volumes from the Storage volumes page.
Idle web session timeout: A Cloud Pak for Data administrator can configure the idle web session timeout in accordance with your security and compliance requirements. If a user leaves their session idle in a web browser for the specified length of time, the user is automatically logged out of the web client.
Auditing assets with IBM Guardium: The method for integrating with IBM Guardium has changed. IBM Guardium is no longer available as an option from the Connections page. Instead, you can connect to your IBM Guardium appliances from the Platform configuration page.
Common core services: These services are installed once and used by multiple services. The common core services include Connections, Workflows, Notifications, Search, Projects and Metadata repositories. They are automatically installed by the services that rely on them.
Analytics Engine Powered by Apache Spark: The service now supports Spark 3.0. You can select (i) the Spark 3.0 template to run Spark jobs or applications on your Cloud Pak for Data cluster by using the Spark jobs REST APIs, or (ii) a Spark 3 environment to run analytical assets in Watson Studio analytics projects.
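
The idle web session timeout item above comes down to comparing the time since a user's last activity with the configured limit. The sketch below shows that check in isolation. It is an illustration only: the function name, the 30-minute value and the timestamps are assumptions, not the platform's actual implementation or default setting.

```python
from datetime import datetime, timedelta


def is_session_expired(last_activity, now, idle_timeout):
    """True when the session has been idle for at least the configured timeout."""
    return now - last_activity >= idle_timeout


# Hypothetical timeout chosen by an administrator.
idle_timeout = timedelta(minutes=30)
last_activity = datetime(2020, 10, 19, 9, 0)

still_active = is_session_expired(last_activity, datetime(2020, 10, 19, 9, 29), idle_timeout)
logged_out = is_session_expired(last_activity, datetime(2020, 10, 19, 9, 30), idle_timeout)
```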

To reiterate, Cloud Pak for Data 3.5 comes packaged with significant improvements and a large number of net new features to address the evolving needs of our clients. As we adjust to the new market reality, enterprises that take the initiative and use this opportunity to streamline, consolidate and transform their architecture will come out ahead, both in weathering the short-term impact through immediate cost savings and in modernizing for an evolving and agile future. Cloud Pak for Data enables just that. IBM is willing to co-invest resources to accelerate and help you with this journey. If you are interested in starting your journey with Cloud Pak for Data, please schedule a consultation with one of our experts or join us for one of the upcoming webinars.

Written by Hemanth Manda