Combining Data Governance and Data Privacy with the IBM Knowledge Catalog Data Privacy Accelerator

Pat O'Sullivan
5 min read · Jul 17, 2023

Authors: Pat O'Sullivan, Paul Kilroy, Julie Forgo

For most organizations looking to enhance the management and governance of their data, there is a critical need to ensure that such initiatives also help address their Data Privacy obligations. This challenge is growing as more jurisdictions across the globe introduce new data privacy legislation.

Data Privacy is just one of many considerations when implementing a comprehensive data governance solution. Typically, an organization will have other requirements relating to business-compliance needs, such as governance for financial reporting, capital adequacy, fraud detection, or security, to name a few. To support the governance goals of complex organizations, IBM offers the Data Privacy Accelerator to help you combine data governance and data privacy into an integrated solution, customized for your particular needs.

The Data Privacy Accelerator, introduced as part of IBM Knowledge Catalog 4.7, provides a curated selection of pre-defined constructs specifically designed to assist organizations that deploy IBM Knowledge Catalog to meet their data privacy requirements. The design of the accelerator was informed by feedback from earlier client deployments that addressed data privacy needs, as well as by insights garnered from IBM’s own internal data privacy initiatives. Your organization can benefit from this real-world experience by adapting this package of assets for your use cases.

A data privacy-specific taxonomy

The core of this accelerator is a set of 500–600 business terms (depending on industry) that specifically cover all of the main business items relevant to Data Privacy. These business terms are categorized in a specific data privacy taxonomy, so that users can view at a glance which business terms pertain to various areas of Data Privacy such as Financial Information, Health & Biometric, Government IDs, and so on.

The sets of industry-specific business terms are integrated into the overall core vocabulary of the IBM Knowledge Accelerators. This means that the Data Privacy taxonomy of terms can be used on its own or can be used as one of many views of the central vocabulary of the enterprise. Thus, Data Privacy does not operate in isolation but can be managed as part of a range of other business initiatives. The following images show categorized views of the taxonomy.

Additionally, all the business terms and other artifacts in the Data Privacy accelerator have already been assigned very specific PI (Personal Information) and SPI (Sensitive Personal Information) classifications. These classifications have been extensively reviewed to ensure that they align with recommended classification in accordance with major data privacy regulations such as GDPR (EU General Data Protection Regulation) and CCPA (California Consumer Privacy Act).
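To make the idea of a categorized, pre-classified vocabulary concrete, here is a minimal Python sketch. It is not the accelerator's data model or an IBM Knowledge Catalog API: the `BusinessTerm` class, the sample terms, the "Contact Information" category, and the PI/SPI values assigned to each term are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    PI = "Personal Information"
    SPI = "Sensitive Personal Information"


@dataclass(frozen=True)
class BusinessTerm:
    """Hypothetical stand-in for a business term in the privacy taxonomy."""
    name: str
    category: str            # taxonomy category the term is filed under
    sensitivity: Sensitivity


# Illustrative terms only; the real accelerator ships several hundred terms,
# and these classifications are assumptions, not the accelerator's values.
TERMS = [
    BusinessTerm("Bank Account Number", "Financial Information", Sensitivity.SPI),
    BusinessTerm("Blood Type", "Health & Biometric", Sensitivity.SPI),
    BusinessTerm("Passport Number", "Government IDs", Sensitivity.SPI),
    BusinessTerm("Postal Address", "Contact Information", Sensitivity.PI),
]


def terms_by_category(terms):
    """Group term names by taxonomy category (the 'at a glance' view)."""
    grouped = defaultdict(list)
    for term in terms:
        grouped[term.category].append(term.name)
    return dict(grouped)


def terms_with_sensitivity(terms, sensitivity):
    """Return the names of all terms carrying the given classification."""
    return [t.name for t in terms if t.sensitivity is sensitivity]


if __name__ == "__main__":
    for category, names in terms_by_category(TERMS).items():
        print(f"{category}: {', '.join(names)}")
    print("SPI terms:", terms_with_sensitivity(TERMS, Sensitivity.SPI))
```

Because each term carries both a category and a classification, the same vocabulary can be browsed by privacy area or filtered by sensitivity without maintaining two separate lists.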

A data privacy-oriented hierarchy of Data Classes

A critical function of any data governance solution in the area of data privacy is to ensure that all of the data across the organization is analyzed, classified, and assigned to the appropriate business terms. It can be a daunting task to execute the data discovery and enrichment process needed to define and maintain an extensive set of well-categorized data classes.

The Data Privacy Accelerator streamlines the process by providing you with a pre-defined set of data classes, including new classes specifically tailored for data privacy, and a category hierarchy for organizing the classes. You can leverage this work by choosing to apply or ignore data classes when you import and enrich the technical metadata for your data assets. This gives you a high degree of control over how your data is processed and classified with minimal setup, allowing you to focus on applying data privacy without first building the infrastructure. The following image shows a typical data class you can include or exclude for your use case.
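The apply-or-ignore choice described above can be pictured with a small Python sketch. Everything in it is hypothetical: the `DataClass` structure, the class names, and the rule that ignoring a parent class also ignores its children are assumptions made for the example, not documented IBM Knowledge Catalog behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataClass:
    """Hypothetical data class with an optional parent in the hierarchy."""
    name: str
    parent: Optional[str] = None


# Illustrative hierarchy only.
DATA_CLASSES = [
    DataClass("Government ID"),
    DataClass("Passport Number", parent="Government ID"),
    DataClass("Driver License Number", parent="Government ID"),
    DataClass("Health Data"),
    DataClass("Blood Type", parent="Health Data"),
]


def active_classes(classes, ignored):
    """Return names of classes to apply during enrichment.

    Assumption for this sketch: ignoring a class also ignores all of its
    descendants in the hierarchy.
    """
    by_name = {c.name: c for c in classes}

    def is_ignored(data_class):
        current = data_class
        while current is not None:
            if current.name in ignored:
                return True
            current = by_name.get(current.parent) if current.parent else None
        return False

    return [c.name for c in classes if not is_ignored(c)]


if __name__ == "__main__":
    # A use case that does not handle health data skips that whole branch.
    print(active_classes(DATA_CLASSES, ignored={"Health Data"}))
```

Keeping the selection as a set of ignored names means the shipped hierarchy stays untouched, and a different use case can apply a different selection to the same set of classes.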

A set of sample Data Privacy Policies and Rules

Within an organization, the various data protection regulations are typically enforced via a set of policies and associated rules. Given the variety of regulations across different jurisdictions, the ways organizations enact them, and the different business needs to be addressed, it is not possible to define an exhaustive set of policies and rules. However, the Data Privacy Accelerator includes a set of sample policies and rules that show how such artifacts can be defined and how they relate to the other data governance artifacts. You can use the samples as both a guide and a shortcut for creating policies and rules that meet the needs of your organization. The following image shows a typical policy for retention of personal information.

These sample policies and associated rules are grouped in what are typically the main functional areas to be addressed as part of any Data Privacy program.

A specific data privacy-oriented dashboard

Finally, you can download a Data Privacy dashboard for use within IBM Knowledge Catalog to get an across-the-board view of data privacy activities.

The dashboard combines the IBM Knowledge Catalog Reporting Database with a set of sample dashboard reports and related SQL queries. The dashboard views are constructed to give data stewards and administrators the means to review key information regarding the data privacy aspects of their data assets. For example, a data steward can easily view which assets are classified as PI or SPI, which assets have associated rules, and so on.
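As a rough illustration of the kind of query behind such a view, the following Python sketch runs SQL against an in-memory SQLite database. The table and column names (`assets`, `asset_classifications`, `asset_rules`) and the sample rows are invented for this example and do not reflect the actual schema of the IBM Knowledge Catalog reporting database.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE assets (
            asset_id   INTEGER PRIMARY KEY,
            asset_name TEXT NOT NULL
        );
        CREATE TABLE asset_classifications (
            asset_id       INTEGER REFERENCES assets(asset_id),
            classification TEXT NOT NULL
        );
        CREATE TABLE asset_rules (
            asset_id  INTEGER REFERENCES assets(asset_id),
            rule_name TEXT NOT NULL
        );
        INSERT INTO assets VALUES
            (1, 'CUSTOMER'), (2, 'PATIENT_RECORD'), (3, 'PRODUCT_CATALOG');
        INSERT INTO asset_classifications VALUES (1, 'PI'), (2, 'SPI');
        INSERT INTO asset_rules VALUES
            (1, 'Retention rule'), (2, 'Masking rule'), (2, 'Access rule');
        """
    )
    return conn


def pi_spi_summary(conn):
    """List assets classified as PI or SPI with their number of rules."""
    query = """
        SELECT a.asset_name, c.classification, COUNT(r.rule_name) AS rule_count
        FROM assets a
        JOIN asset_classifications c ON c.asset_id = a.asset_id
        LEFT JOIN asset_rules r ON r.asset_id = a.asset_id
        WHERE c.classification IN ('PI', 'SPI')
        GROUP BY a.asset_name, c.classification
        ORDER BY a.asset_name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, classification, rule_count in pi_spi_summary(build_sample_db()):
        print(f"{name}: {classification}, {rule_count} rule(s)")
```

The `LEFT JOIN` keeps classified assets that have no rules yet (they appear with a count of 0), which is often the gap a data steward wants to spot.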

You can adopt and tailor this sample IBM Dashboard for the needs of your organization, or you can use it as a guide for creating a dashboard using other business intelligence reporting tools.

For more details on the steps needed to create such a dashboard, see this separate article: medium.com/@paul.kilroy/building-a-data-privacy-dashboard-using-ibm-knowledge-catalog-325b18bb103d

Summary

The Data Privacy Accelerator offers clients of IBM Knowledge Catalog the ability to immediately deploy a set of governance artifacts to kickstart their data governance activities in the area of data privacy. Customers can deploy the Data Privacy Accelerator as a standalone component, or use it in conjunction with other predefined content from other IBM Knowledge Accelerators.


Pat O'Sullivan

Senior Technical Staff Member with IBM. A Data Architect with a background in Data Models, Business Glossaries, Data Governance and Data Management.