Data Governance: Exploring the Paradigm with Watson Knowledge Catalog — Chapter 2
Chapter 2: Business Terms and Classifications
[co-authored by Praveen Devarao and Arnajdas]
In the previous chapter we learnt about data governance and how data catalogs play a pivotal role in an effective data governance framework.
In this chapter we will learn and explore, using the capabilities of Watson Knowledge Catalog, two of the key data governance constructs: Business terms and Classifications.
Business Terms
Business terms help standardize business definitions in the organization. You can look at business terms as a dictionary that users within the organization can refer to and contribute to, so that everyone shares a common understanding of each term.
While a dictionary is a simple list of terms and their definitions, business terms hold more than definitions: they can also record how each term is related to other terms defined within the organization.
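The difference between a flat dictionary and a set of related terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BusinessTerm` and `Glossary` classes, their fields, and the sample definitions are hypothetical and are not the Watson Knowledge Catalog data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BusinessTerm:
    """Illustrative glossary entry (hypothetical, not the WKC data model)."""
    name: str
    definition: str
    abbreviations: list[str] = field(default_factory=list)
    related: set[str] = field(default_factory=set)  # names of related terms


class Glossary:
    """A dictionary of terms that also tracks relationships between them."""

    def __init__(self) -> None:
        self.terms: dict[str, BusinessTerm] = {}

    def add(self, term: BusinessTerm) -> None:
        self.terms[term.name] = term

    def relate(self, a: str, b: str) -> None:
        # Record the relationship on both terms so it can be followed either way.
        self.terms[a].related.add(b)
        self.terms[b].related.add(a)


# Sample definitions are made up for the example.
glossary = Glossary()
glossary.add(BusinessTerm("Shipping", "Moving goods from seller to buyer."))
glossary.add(BusinessTerm("Logistics", "Planning and coordinating the movement of goods."))
glossary.relate("Shipping", "Logistics")
print(sorted(glossary.terms["Shipping"].related))
```

A plain dictionary would stop at `name` and `definition`; the `related` field is what lets a reader move from one term to its neighbours.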
Business terms are typically used to describe the contents of a data artifact. They are associated with data assets so that non-technical users can easily find relevant data within the organization in the context of the business term.
To create a business term in Watson Knowledge Catalog, open the Business terms menu from the left side of the Cloud Pak for Data landing page. The image below shows the Cloud Pak for Data dashboard with the Navigation Menu expanded and the Business terms entry under the Governance section highlighted.
On entering the Business terms page you will see a list of published and draft terms defined within the system. The image below shows the Business terms landing page with a list of terms for a fictitious travel-industry organization, including terms such as Trade, Shipping, and Logistics.
To create a business term, click Add Business Term -> New Business Term. Key in the name of your business term and choose the category in which the term is to be created. Optionally, you can add abbreviations and a description. You can also save the term as a draft.
Extras: A category, loosely put, can be thought of as an operating system folder under which you organize the different governance artifacts of WKC. Besides organizing artifacts, you can grant a user or group of users permissions on the category, and these apply implicitly to all artifacts within it. If you are interested in learning more, read this post on using categories to manage governance artifacts.
The image below shows the newly created term in a draft state. The Overview tab shows the description, and the Related terms section captures the relationships between this term and other terms defined in the system.
Every governance artifact created in Watson Knowledge Catalog goes through a workflow process in which a user can edit the draft copy as many times as needed and then move it through a review process before it is published and made available for consumption in the platform. Refer to the post Governance Workflows — The Key to Building Successful Information Architectures by Namit Kabra to understand and try out the workflow capability of Watson Knowledge Catalog.
To move the term to the review state, click on Send For Approval. The person assigned to the reviewer role can approve or reject the draft. After approval you can publish the term. The image below shows the term in a draft state and the Publish action being selected from the workflow action button.
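The draft, review, and publish steps described above can be pictured as a small state machine. The sketch below is a simplified model, not how Watson Knowledge Catalog implements workflows: the state and action names are hypothetical, and the choice that a rejected draft returns to the draft state is an assumption.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed (state, action) -> next state. Action names are hypothetical.
TRANSITIONS = {
    (State.DRAFT, "send_for_approval"): State.IN_REVIEW,
    (State.IN_REVIEW, "approve"): State.APPROVED,
    (State.IN_REVIEW, "reject"): State.DRAFT,  # assumption: rejection returns to draft
    (State.APPROVED, "publish"): State.PUBLISHED,
}


def advance(state: State, action: str) -> State:
    """Return the next state, or raise if the action is not allowed from this state."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} from {state.value!r}") from None


state = State.DRAFT
for action in ("send_for_approval", "approve", "publish"):
    state = advance(state, action)
print(state.value)
```

Keeping the transitions in a table makes the rule explicit: a term cannot be published straight from draft, because no `(State.DRAFT, "publish")` entry exists.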
With this, the term is active and available for assignment to data assets in the catalog. As a data steward, assign the newly created term to assets within a catalog and try searching for assets using the term name as the keyword. All the assets that have been associated with the term should show up in the search results.
The image below shows the search results for all catalog assets associated with the business term that was keyed in as the search word in the global search bar.
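Conceptually, searching by a term name works like looking up an index from terms to the assets they are assigned to. The following is a minimal sketch of that idea only; the asset names and assignments are made up, and the real global search in Cloud Pak for Data is certainly more involved than an exact keyword match.

```python
from collections import defaultdict

# Hypothetical asset names and term assignments, for illustration only.
assignments = {
    "shipments_2023.csv": {"Shipping", "Logistics"},
    "carrier_contracts": {"Logistics"},
    "trade_ledger": {"Trade"},
}

# Build an inverted index: lowercased term name -> set of asset names.
index = defaultdict(set)
for asset, terms in assignments.items():
    for term in terms:
        index[term.lower()].add(asset)


def search(keyword: str) -> list[str]:
    """Return assets whose assigned terms match the keyword (case-insensitive, exact)."""
    return sorted(index.get(keyword.strip().lower(), set()))


print(search("logistics"))
```

Because assets are indexed by the business term rather than by column or file names, a user who knows only the business vocabulary can still find the data.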
Classifications
Classifications are a special type of label used to indicate unique attributes or properties of an asset within the data catalog. You can look at them as tags within the system. While tags are free-form and each user can define them as they see fit, classifications are globally defined and are scrutinized before they are applied in the system.
Because they are scrutinized, classifications help achieve consistent and reliable self-service governance in catalogs: a classification can serve as one of the criteria in a Data Protection Rule. You can also search for assets defined within the system by the classification associated with them, just as you can search by business term.
These classifications can also be used to describe other governance artifacts. This provides a method for the grouping of data assets and governance artifacts.
To create a classification in Watson Knowledge Catalog, click on Classifications from the Navigation Menu on the Cloud Pak for Data home page. The image below shows the Cloud Pak for Data dashboard with the Navigation Menu expanded and the Classifications menu entry point.
The classifications landing page lists the published and draft classifications defined within the system, as shown in the image below.
To create a new classification, click on Add classification -> New classification. Key in the name of the classification with an optional description. Select a primary category and click on Save as draft.
Edit the draft to assign stewards, change the description, and relate it to other classifications defined within the system. The image below shows the new classification drafted and ready to be published for consumption in the platform.
As we did while exploring business terms in the previous section, move the classification through the workflow steps and publish it.
Associate this classification with a few assets in your catalog and try searching by keying the classification name as the search string. The search should show all the assets associated with the new classification.
You can experiment further by defining a Data Protection Rule, as illustrated in the post Data Protection Rule in Watson Knowledge Catalog, with a classification as one of the conditions that determine whether a particular user gets access to a data set.
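The idea of a classification acting as a rule condition can be sketched as follows. This is a deliberately reduced model, assuming a single allow-or-deny outcome: the `Asset` and `ProtectionRule` classes, the "Confidential" classification, and the role names are all hypothetical, and actual Data Protection Rules in Watson Knowledge Catalog are defined through the product, not through code like this.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    classifications: frozenset[str]


@dataclass(frozen=True)
class ProtectionRule:
    """Deny access when the asset carries the classification,
    unless the user holds one of the exempt roles (hypothetical model)."""
    classification: str
    exempt_roles: frozenset[str]

    def allows(self, asset: Asset, user_roles: set[str]) -> bool:
        if self.classification not in asset.classifications:
            return True  # the rule does not apply to this asset
        # The rule applies: allow only if the user has an exempt role.
        return bool(self.exempt_roles & user_roles)


rule = ProtectionRule("Confidential", frozenset({"data_steward"}))
asset = Asset("customer_accounts", frozenset({"Confidential"}))
print(rule.allows(asset, {"analyst"}))       # False
print(rule.allows(asset, {"data_steward"}))  # True
```

The point of the sketch is that the rule never names an individual asset. It names a classification, so any asset that later receives that classification is covered automatically.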
Conclusion:
In this chapter we learnt what Business Terms and Classifications are in a data governance framework. We learnt how to define these constructs in Watson Knowledge Catalog and saw them in action for searching assets and defining Data Protection Rules.
In the next chapter we will learn what Data Classes are and explore them in the Watson Knowledge Catalog platform.