Anvita Vyas
Jul 21, 2022

Pause and Resume Reporting in Cloud Pak for Data

How can you take control of reporting to pause and resume WKC data synchronization into the reporting data mart?

In the previous post on reporting, we learned about reporting on governance data in Watson Knowledge Catalog [WKC] and why reporting is needed in data governance. We also learned to create a non-vaulted connection using a Db2 cloud instance and to set up reporting on an external Db2 data mart. In this post, we will learn about the new capabilities introduced in reporting to make it more efficient and secure. We will also learn to create a platform connection using PostgreSQL.

While governance data is being synchronized into the data mart, a user might want to stop the synchronization to perform maintenance activities in Watson Knowledge Catalog [WKC]. In this post, we will learn to configure reporting using PostgreSQL and to pause and resume the flow of WKC data into the data mart.

Configuring reporting using PostgreSQL platform connection

  • Provision a PostgreSQL instance

The reporting service now supports storing WKC governance data in PostgreSQL version 12 or later. To create a platform connection to PostgreSQL, you can either install PostgreSQL on a virtual machine or use a PostgreSQL cloud instance from IBM Cloud. Ensure that the PostgreSQL instance and your IBM Cloud Pak for Data deployment can reach each other over the network.
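The "version 12 or later" requirement can be checked against the value that PostgreSQL returns for `SHOW server_version_num;`. The following is a minimal sketch; the function names are hypothetical and not part of Cloud Pak for Data. It relies on the fact that, for PostgreSQL 10 and later, this number is the major version multiplied by 10000 plus the minor version.

```python
def postgres_major_version(server_version_num: int) -> int:
    """Return the major version encoded in PostgreSQL's server_version_num."""
    # For PostgreSQL 10 and later the value is major * 10000 + minor,
    # for example 120011 for 12.11 and 130004 for 13.4.
    return server_version_num // 10000


def is_supported_for_reporting(server_version_num: int, minimum_major: int = 12) -> bool:
    """Check the 'version 12 or later' requirement stated in this post."""
    return postgres_major_version(server_version_num) >= minimum_major


if __name__ == "__main__":
    # 120011 is what `SHOW server_version_num;` reports on a 12.11 server.
    print(is_supported_for_reporting(120011))
```

Older releases such as 9.6 use a different encoding, but the integer division still yields a major version below 12, so they are rejected as well.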

To install PostgreSQL 12 using Docker containers on your virtual machine, run the following commands:
1. docker pull postgres:12
2. docker run -itd --name postgres12 -p 5432:5432 -e POSTGRES_PASSWORD=pgsql postgres:12
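Once the container is running, it can help to confirm that the published port accepts connections from the machine that needs to reach it. Below is a minimal sketch using only the Python standard library; the function name `can_reach` and the host value are assumptions for illustration. It only checks that a TCP connection can be opened and does not validate credentials or the database itself.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Replace "localhost" with the address of the VM that runs the container.
    print(can_reach("localhost", 5432))
```

A result of False usually points to a firewall rule, a port that was not published, or a wrong host name rather than to a problem in the reporting setup.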

Alternatively, in your IBM Cloud account, create a Hyper Protect DBaaS for PostgreSQL instance. We will select PostgreSQL version 13 from the drop-down menu to create the PostgreSQL instance.

Hyper Protect DBaaS for PostgreSQL cloud instance creation
  • Create platform connection using PostgreSQL

Navigate to the Platform Connections tab in the navigation panel of the Cloud Pak for Data dashboard and click New Connection. On the next page, select the type of database. We select PostgreSQL in this case.

Select PostgreSQL to create a platform connection

In the public cloud, we support non-vaulted platform connections. In Cloud Pak for Data v4.5, you can create a platform connection using a vault secret. There is a separate blog post on creating a vaulted connection.

Create a PostgreSQL connection using a vault secret
  • Configure WKC to write governance data into data mart

From the IBM Cloud Pak for Data dashboard, open the Catalogs page and then the Reports Setup tab.
The image below shows the entry point on the IBM Cloud Pak for Data home page and the Reports setup page.

Configure reporting using PostgreSQL

Read the section "Configure WKC to write data into Data Mart" in the previous post to set up reporting into the data mart.

After reporting is set up, WKC data is automatically synchronized to the data mart. Any change in a catalog, project, category, or data protection rule that is enabled for reporting is reflected in the database.

  • Pause the data flow from WKC to data mart

Because synchronization between Watson Knowledge Catalog and the data mart is automatic, a user may need to stop it for a short period of time to perform maintenance activities. The user can pause the data flow to the data mart by clicking Pause reporting.

Pause reporting to stop data synchronization

If a user creates any WKC data while reporting is paused, that data is stored in a cache.

  • Resume the data flow

The user can resume data synchronization after all maintenance activities are complete by clicking Resume reporting. All cached data created during maintenance is then consumed by the reporting service.

Resume reporting to start data synchronization
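The pause and resume behavior described above can be pictured with a small toy model. This is only an illustrative sketch and not the reporting service's actual implementation: the class `PausableSync`, its method names, and the in-memory `data_mart` list are all hypothetical. It shows events being written straight through while reporting is active, held in a cache while paused, and drained in order on resume.

```python
from collections import deque
from typing import Deque, Dict, List


class PausableSync:
    """Toy model: forward events to a target, buffering them while paused."""

    def __init__(self) -> None:
        self.paused = False
        # Events created while reporting is paused wait here.
        self.cache: Deque[Dict[str, str]] = deque()
        # Stand-in for the PostgreSQL data mart.
        self.data_mart: List[Dict[str, str]] = []

    def publish(self, event: Dict[str, str]) -> None:
        """Record a change; write it now or cache it if paused."""
        if self.paused:
            self.cache.append(event)
        else:
            self.data_mart.append(event)

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        """Resume and drain the cache in the order events arrived."""
        self.paused = False
        while self.cache:
            self.data_mart.append(self.cache.popleft())


if __name__ == "__main__":
    sync = PausableSync()
    sync.publish({"asset": "a1"})   # written immediately
    sync.pause()
    sync.publish({"asset": "a2"})   # cached during maintenance
    sync.resume()                   # cached event is consumed
    print(len(sync.data_mart), len(sync.cache))
```

The point of the model is that nothing created during the maintenance window is lost; it simply reaches the data mart later.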

In this way, a user can control data synchronization into the data mart. To learn more about reporting, you can read the product documentation.

Conclusion

In this post, we learned how to configure reporting using a PostgreSQL data mart. We also learned how to pause automatic data synchronization into the data mart for a short period of time and resume it again.