New release of Watson Knowledge Catalog on Cloud Pak for Data

Leslie Fowler
IBM Data Science in Practice
3 min read · Jun 26, 2020


On June 19, 2020, IBM released Cloud Pak for Data v3, which includes new capabilities for Watson Knowledge Catalog.

For those who aren’t familiar with Cloud Pak for Data (CPD), it is our fully integrated data and AI platform that modernizes how businesses collect, organize and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle, from data management, data integration and governance to analytics and advanced AI. IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information architecture you need to implement analytics and AI successfully.

IBM Cloud Pak for Data

Watson Knowledge Catalog

Watson Knowledge Catalog (WKC) is part of the Cloud Pak for Data platform and provides intelligent, self-service discovery of data, models, and more. It enables businesses to access, curate, categorize and share data, knowledge assets and their relationships, and to assess data quality, wherever those assets reside.

WKC combines our InfoSphere Information Governance Catalog and InfoSphere Information Analyzer (part of InfoSphere Information Server) into a single application that delivers our governance, quality, and catalog capabilities on a modernized platform: Cloud Pak.

Key capabilities of WKC include:
- Real-time data virtualization support
- Automated data discovery and metadata generation
- ML-extracted business glossary from most common regulatory terms
- Dynamic data masking to protect sensitive data
- Automated scanning and risk assessments of unstructured data

Watson Knowledge Catalog

The videos below demonstrate some of the capabilities available with WKC in this latest release.

The Catalog
Data Refining
Data Virtualization with the Glossary
Data Virtualization and Data Protection

What’s new in CPD v3

There are many new features available in WKC with this new release. Here are the key highlights:

Automatic term assignment during the curation process has been improved. When you reject terms that the machine learning service suggests, it learns from your actions and generates better results in the future.

Data analysis during advanced data curation has been enhanced:
- Filtering capabilities speed up relationship analysis and overlap analysis
- The trend of an asset's quality score over time is displayed, along with how the number of violations changed between the last two analyses
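
As a rough illustration of what these two views report, the following plain-Python sketch computes a quality score trend and the change in violations between the last two analyses. It does not call any WKC API; the `Analysis` record, its field names, and the sample numbers are made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Analysis:
    """Hypothetical record of one quality analysis run for an asset."""
    run_date: date
    quality_score: float  # percent, 0-100 (assumed scale)
    violations: int


def score_trend(history):
    """Return (date, score) pairs ordered from oldest to newest."""
    ordered = sorted(history, key=lambda a: a.run_date)
    return [(a.run_date, a.quality_score) for a in ordered]


def violation_change(history):
    """Return latest violations minus previous violations, or None if fewer than two runs."""
    if len(history) < 2:
        return None
    ordered = sorted(history, key=lambda a: a.run_date)
    return ordered[-1].violations - ordered[-2].violations


# Illustrative sample data only.
history = [
    Analysis(date(2020, 6, 1), 82.5, 40),
    Analysis(date(2020, 5, 1), 78.0, 55),
    Analysis(date(2020, 6, 15), 88.0, 25),
]

for run_date, score in score_trend(history):
    print(run_date.isoformat(), score)
print("Change in violations:", violation_change(history))
```

A negative change means the most recent analysis found fewer violations than the one before it.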

Automatic data class creation using clustering technology.

Governance workflow capabilities have been improved:
- Activity log for full history of governance artifacts (terms, policies, etc.)
- Ability to send email for different workflow steps

Catalog improvements include:
- Teradata and File assets are now synced to the default catalog
- New connectors for Impala and Planning Analytics

Data protection rules (which mask or redact PII data for business users) have been improved: you can now include classifications in the criteria when defining a rule. For more information on the data protection policies, see here.
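
To make the idea concrete, here is a minimal sketch in plain Python of how a rule whose criteria include an asset's classification might decide what a business user sees. It does not use the WKC API; the `Rule` class, the classification and data class labels, and the exact behavior of the "mask" and "redact" actions are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: applies when classification and data class both match."""
    classification: str  # asset classification, e.g. "Confidential" (assumed label)
    data_class: str      # column data class, e.g. "Email Address" (assumed label)
    action: str          # "redact" or "mask" (assumed action names)


def apply_rule(rule, asset_classification, column_data_class, value):
    """Return the value a business user would see under this illustrative rule."""
    criteria_met = (
        asset_classification == rule.classification
        and column_data_class == rule.data_class
    )
    if not criteria_met:
        return value  # rule does not apply, value is shown unchanged
    if rule.action == "redact":
        return "X" * len(value)  # replace every character
    if rule.action == "mask":
        return value[:1] + "*" * (len(value) - 1)  # keep only the first character
    raise ValueError(f"unknown action: {rule.action}")


rule = Rule(classification="Confidential", data_class="Email Address", action="redact")
print(apply_rule(rule, "Confidential", "Email Address", "a@b.com"))
print(apply_rule(rule, "Public", "Email Address", "a@b.com"))
```

In this sketch the same column value is hidden only when the asset carries the matching classification, which is the kind of distinction that classification-based criteria make possible.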

Risk assessment of unstructured data is available via Instascan, which finds PII data in unstructured data types. Check it out here.

Recent Webinar and Demo

Webinar on Watson Knowledge Catalog in CPD 3

Free trial of CPD and WKC

http://ibm.biz/cloud-pak-for-data

Connect

Join the IBM DataOps Community Page to connect with experts and peers.

Leslie Fowler
fowlerl@us.ibm.com
