Data Governance: Exploring the Paradigm with Watson Knowledge Catalog — Chapter 2

Praveen Devarao
IBM Data Science in Practice
6 min read · Apr 1, 2021

Chapter 2: Business Terms and Classifications

[co-authored by Praveen Devarao and Arnajdas]

a partial view of a mobile phone screen with the word “design” being defined in a dictionary app
Photo by Edho Pratama on Unsplash

In the previous chapter, we learned about data governance and how data catalogs play a pivotal role in an effective data governance framework.

In this chapter, we use the capabilities of Watson Knowledge Catalog to explore two key data governance constructs: business terms and classifications.

Business Terms

Business terms help standardize business definitions in the organization. You can look at business terms as a dictionary which users within the organization can refer to and contribute to in order for everyone to have a common understanding of the term.

While a dictionary is a simple list of terms and their definitions, business terms carry more than definitions: each term can also record how it relates to other terms defined within the organization.

Business terms are typically used to describe the contents of a data artifact. They are associated with data assets so that non-technical users can easily find relevant data within the organization by searching in the context of the business term.
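To make the difference from a flat dictionary concrete, here is a minimal Python sketch of a glossary whose entries carry relationships. It is an illustrative in-memory model only, not the Watson Knowledge Catalog API; the class name `BusinessTerm`, the helper `relate`, the field names, and the sample terms and descriptions are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessTerm:
    """Hypothetical model of a glossary entry (not the WKC API)."""
    name: str
    description: str
    category: str
    synonyms: set[str] = field(default_factory=set)
    related: set[str] = field(default_factory=set)  # names of related terms


def relate(glossary: dict[str, BusinessTerm], a: str, b: str) -> None:
    """Record a symmetric 'related term' link between two existing terms."""
    glossary[a].related.add(b)
    glossary[b].related.add(a)


# Sample terms invented for illustration.
glossary = {
    term.name: term
    for term in [
        BusinessTerm("Customer", "A party that buys goods or services", "General"),
        BusinessTerm("Account", "A record of a customer's dealings", "General"),
    ]
}
relate(glossary, "Customer", "Account")
print(sorted(glossary["Customer"].related))
```

A plain dictionary would stop at `name` and `description`; the `related` and `synonyms` fields are what let a reader navigate from one term to its neighbours.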

To create a business term in Watson Knowledge Catalog, open the Business terms menu from the left side of the Cloud Pak for Data landing page. The image below shows the Cloud Pak for Data dashboard with the navigation menu expanded and the Business terms entry highlighted under the Governance section.

screenshot of the Cloud Pak for Data dashboard with a navigation menu open to the left with “Business Terms” under the “Governance” section highlighted
Cloud Pak for Data dashboard with expanded Navigation Menu

On entering the Business terms page, you will see a list of the published and draft terms defined within the system. The image below shows the Business terms landing page for a fictitious travel-industry organization, with terms such as Trade, Shipping, and Logistics.

A screenshot of the Business Terms landing page with the terms “Trade”, “Shipping”, “Company”, “Transport”, “Logistics”, “Accounts”, “Management”, and “Travel” listed
Business Terms landing page

To create a business term, click Add Business Term -> New Business Term. Enter the name of your business term and choose the category in which the term is to be created. Optionally, you can add abbreviations and a description. You can also save the term as a draft.

Extras: Loosely put, a category is like an operating system folder under which you can organize the different governance artifacts of WKC. Besides organizing artifacts, you can grant a user or group of users permissions on the category, and those permissions implicitly apply to all artifacts within it. If you are interested in learning more, read this post on using categories to manage governance artifacts.

The image below shows the newly created term in a draft state. The Overview tab shows the description, and the Related terms section captures the relationships between this term and other terms defined in the system.

a screenshot of a draft state of the business term definition page for the term “Customer” showing the fields “Description”, “Type Relationships”, “Primary Category”, “Secondary Category”, “Synonyms”, “Part Relationships”, and “Other Related Business Terms”
Business Term details page

Every governance artifact created in Watson Knowledge Catalog goes through a workflow process: a user can edit the draft copy as many times as needed, then send it through a review, and finally publish it to make it available for consumption in the platform. To understand and try out the workflow capability of Watson Knowledge Catalog, refer to the post Governance Workflows — The Key to Building Successful Information Architectures by Namit Kabra.

To move the term to the review state, click Send For Approval. The person assigned to the reviewer role can approve or reject the draft. After approval, you can publish the term. The image below shows the term in a draft state with the Publish action selected from the workflow action button.

Moving Business Term through workflow process for publish
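The draft, review, and publish steps described above can be pictured as a small state machine. The sketch below is a simplified model and not how Watson Knowledge Catalog implements workflows: the state names, the action names, and the assumption that a rejected draft returns to the draft state are all chosen here for illustration.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed (state, action) -> next state. Action names are illustrative.
TRANSITIONS = {
    (State.DRAFT, "send_for_approval"): State.IN_REVIEW,
    (State.IN_REVIEW, "approve"): State.APPROVED,
    (State.IN_REVIEW, "reject"): State.DRAFT,  # assumption: rejection reopens the draft
    (State.APPROVED, "publish"): State.PUBLISHED,
}


def advance(state: State, action: str) -> State:
    """Return the next state, or raise if the action is not allowed here."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not allowed from {state.value!r}") from None


state = State.DRAFT
for action in ("send_for_approval", "approve", "publish"):
    state = advance(state, action)
print(state.value)
```

Keeping the allowed transitions in one table makes the governance point explicit: in this model a term cannot reach the published state without passing through review.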

With this, the term is active and available for assignment to data assets in the catalog. As a data steward, assign the newly created term to assets within a catalog, then try searching for assets using the term name as the keyword. All the assets that have been associated with the term should show up in the search results.

The image below shows the results of entering the business term as the search word in the global search bar: all assets in catalogs that are associated with the term.

screenshot showing search results for the business term “Customer”. It shows a list of results by Name with the type of result, the tags for the result, who has modified the result, and when it was modified on.
Search by term name results
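Conceptually, searching by term name is a lookup over the terms assigned to each asset. The following sketch shows that idea with plain Python data; it does not call the Watson Knowledge Catalog search service, and the asset names, the `terms` field, and the function `find_by_term` are hypothetical. A real global search would typically also match names, descriptions, and tags.

```python
# Hypothetical catalog assets, each with the business terms assigned to it.
assets = [
    {"name": "CUSTOMER_MASTER.csv", "terms": {"Customer", "Account"}},
    {"name": "SHIPMENTS_2020.csv", "terms": {"Shipping"}},
    {"name": "crm_contacts", "terms": {"Customer"}},
]


def find_by_term(assets: list[dict], keyword: str) -> list[str]:
    """Return names of assets that have a term matching the keyword (case-insensitive)."""
    needle = keyword.casefold()
    return [
        asset["name"]
        for asset in assets
        if any(needle == term.casefold() for term in asset["terms"])
    ]


print(find_by_term(assets, "customer"))
```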

Classifications

a set of file folders in a box or drawer with labels such as “Taxes”, “Receipts”, “Insurance”, and “Financing”
Photo by Sarah Pflug from Burst

Classifications are a special type of label used to indicate unique attributes or properties of an asset within the data catalog. You can think of them as tags within the system. While tags are free-form and each user can define them as they see fit, classifications are globally defined and are reviewed before they can be applied in the system.

Because they are scrutinized in this way, classifications help achieve consistent and reliable self-service governance in catalogs: a classification can be used as a criterion in a Data Protection Rule. You can also search for assets defined within the system based on the classification associated with them, just as you can search by business term.

Classifications can also be used to describe other governance artifacts, which provides a way to group data assets and governance artifacts.
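The contrast between free-form tags and governed classifications can be sketched as an open set versus a controlled vocabulary. The code below is an illustration of that idea only, not the Watson Knowledge Catalog API; the function names, the asset structure, and the sample values are assumptions.

```python
# Assumption: only classifications that have been published may be applied.
PUBLISHED_CLASSIFICATIONS = {"Confidential", "PII", "SPI"}


def add_tag(asset: dict, tag: str) -> None:
    """Tags are free-form: any user-chosen string is accepted."""
    asset["tags"].add(tag)


def classify(asset: dict, classification: str) -> None:
    """Classifications are governed: reject anything not in the published set."""
    if classification not in PUBLISHED_CLASSIFICATIONS:
        raise ValueError(f"{classification!r} is not a published classification")
    asset["classifications"].add(classification)


asset = {"name": "crm_contacts", "tags": set(), "classifications": set()}
add_tag(asset, "q3-cleanup")  # accepted without any check
classify(asset, "PII")        # accepted because it is in the published set
print(sorted(asset["classifications"]))
```

Because every applied classification comes from one reviewed list, a rule or search that depends on "PII" behaves the same way for every asset, which is not guaranteed for ad hoc tags.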

To create a classification in Watson Knowledge Catalog, click Classifications in the navigation menu on the Cloud Pak for Data home page. The image below shows the Cloud Pak for Data dashboard with the navigation menu expanded and the Classifications menu entry highlighted.

screenshot of the Cloud Pak for Data dashboard with a navigation menu open to the left with “Classification” under the “Governance” section highlighted
Cloud Pak for Data dashboard with expanded Navigation Menu

The classifications landing page lists the published and draft classifications defined within the system, as shown in the image below.

a screenshot of a list of Classifications within a knowledge catalog. In this list, there are classifications such as “Confidential”, “PII”, “SPI”, “Revenue”, “Intelligence report”, “Training Data”, and “Test Data” listed. These results can be sorted by filters such as Name.
List of published classifications

To create a new classification, click Add classification -> New classification. Enter the name of the classification and, optionally, a description. Select a primary category and click Save as draft.

Edit the draft to assign stewards, change the description, and relate it to other classifications defined within the system. The image below shows the new classification drafted and ready to be published for consumption in the platform.

a screenshot of a draft page for a new classification. Here, the new classification is “Employee data” with the field “Description” filled in with “Data of employees”, the field “Primary Category” filled in with “General”, and the other fields left blank.

As we did for business terms in the previous section, move the classification through the workflow steps and publish it.

Associate this classification with a few assets in your catalog, then try searching with the classification name as the search string. The search should show all the assets associated with the new classification.

screenshot of a list of results for searching for the classification “Employee data”, showing the data assets related to the classification.

You can experiment further by defining a Data Protection Rule, as illustrated in the post Data Protection Rule in Watson Knowledge Catalog, with the classification as one of the conditions that determine whether a particular user can access a data set.
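As a rough picture of what "classification as a condition" means, the sketch below evaluates a deny-style rule against the classifications on an asset. It is a simplified model under stated assumptions, not the Watson Knowledge Catalog rule engine: the `Rule` class, the `exempt_users` field, the user names, and the deny-only behaviour are all hypothetical, and real rules may support richer conditions and actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: deny access to assets carrying a classification."""
    classification: str
    exempt_users: frozenset[str]  # users the rule does not apply to


def is_allowed(user: str, asset_classifications: set[str], rules: list[Rule]) -> bool:
    """Return False if any rule matches the asset and the user is not exempt."""
    for rule in rules:
        if rule.classification in asset_classifications and user not in rule.exempt_users:
            return False
    return True


# Sample rule using the "Employee data" classification created earlier.
rules = [Rule("Employee data", frozenset({"hr_admin"}))]

print(is_allowed("hr_admin", {"Employee data"}, rules))
print(is_allowed("analyst", {"Employee data"}, rules))
```

The rule never names an individual asset. It refers only to the classification, so any asset a steward later classifies as "Employee data" falls under the same policy in this model.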

Conclusion

In this chapter, we learned what business terms and classifications are in a data governance framework. We also learned how to define these constructs in Watson Knowledge Catalog and saw them in action when searching for assets and defining Data Protection Rules.

In the next chapter we will learn what Data Classes are and explore them in the Watson Knowledge Catalog platform.

Praveen Devarao
IBM Data Science in Practice

CMTS @ Oracle Cloud, previously Software Architect @ IBM India Software Labs