Use custom ML Model for automated Term Assignment on Cloud Pak for Data — Part III

Steven Huang
Nov 20, 2023

AI and ML

This blog shows how to configure Metadata Enrichment in a Watson Studio project to use the custom model created in Part II to assign terms to assets and columns.

Please note that Cloud Pak for Data on-premises is required to complete the tasks in this blog, because only the on-premises version has the configuration entry that lets Metadata Enrichment use a custom model.

To verify the custom model, you must have a dedicated project, because its configuration differs from the default settings.

From the main navigation menu, select Projects > All projects. On the page that opens, click New Project, select Create an empty project, and follow the instructions on the screen.

Change the configurations for Metadata Enrichment

  • Open the Manage tab
  • Click Metadata enrichment in the left side panel
  • Scroll down to the "Term assignment methods to use" section
  • Uncheck Machine learning, Data-class-based assignments, and Linguistic name matching
  • Select Custom service

To configure the custom service, click the Select service button. Use the information generated by the sample notebook to fill in the pages.

Deployment space: Custom model space
Deployment: demo_tp_scoring_deployment
Input transformation code: {"input_data":[{"values":$append([ [$$.metadata.name, ""] ], $$.entity.data_asset.columns.[[$$.metadata.name, name]])}]}
Output transformation code: {"term_assignments": predictions[0].values ~> $map(function($x){function($z){$count($z) > 1? $z : [$z]}($x[0] ~> $zip($x[1]) ~> $map(function($y){{"term_id": $y[0], "confidence": $y[1]}})) })}
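To make the two transformation expressions above easier to follow, here is a minimal Python sketch of what they appear to do, as I read the expressions. The input transformation builds one [asset name, column name] row per column, preceded by an [asset name, ""] row for the asset itself. The output transformation pairs each row's term IDs with their confidence scores. The function names, the sample asset, and the sample model response are hypothetical and only illustrate the shapes involved; the real service evaluates the expressions itself, so this code is not part of the configuration.

```python
def build_scoring_payload(asset):
    """Mimic the input transformation (illustrative only).

    Assumes the asset JSON exposes metadata.name and
    entity.data_asset.columns[].name, as the expression navigates them.
    """
    asset_name = asset["metadata"]["name"]
    columns = asset["entity"]["data_asset"].get("columns", [])
    # First row stands for the asset itself: empty column name.
    rows = [[asset_name, ""]]
    # One row per column: [asset name, column name].
    rows += [[asset_name, column["name"]] for column in columns]
    return {"input_data": [{"values": rows}]}


def to_term_assignments(scoring_response):
    """Mimic the output transformation (illustrative only).

    Assumes each row of predictions[0].values holds two parallel lists:
    term IDs first, confidence scores second.
    """
    rows = scoring_response["predictions"][0]["values"]
    return {
        "term_assignments": [
            [
                {"term_id": term_id, "confidence": confidence}
                for term_id, confidence in zip(row[0], row[1])
            ]
            for row in rows
        ]
    }


# Hypothetical asset with two columns.
sample_asset = {
    "metadata": {"name": "CUSTOMER"},
    "entity": {"data_asset": {"columns": [{"name": "CUST_ID"}, {"name": "EMAIL"}]}},
}

# Hypothetical model response: one row per input row (asset, then columns).
sample_response = {
    "predictions": [
        {
            "values": [
                [["term-1"], [0.12]],
                [["term-2", "term-3"], [0.31, 0.07]],
                [["term-4"], [0.22]],
            ]
        }
    ]
}

payload = build_scoring_payload(sample_asset)
assignments = to_term_assignments(sample_response)
print(payload)
print(assignments)
```

The first row of the payload carries an empty column name, which is how the asset-level term assignment is requested alongside the column-level ones.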

Click Next, review all inputs, and click Select on the last step.

Because of the way confidence scores are computed, the values might be small compared to the results of the other term assignment methods. To increase the likelihood that terms are assigned or suggested, lower the threshold under Term assignment on the configuration page accordingly.
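The effect of lowering the threshold can be illustrated with a small Python sketch. Everything here is hypothetical: the parameter names, the threshold values, the scores, and the "greater than or equal" comparison are assumptions chosen for illustration, not the product's actual settings or logic.

```python
def classify(confidence, assign_threshold, suggest_threshold):
    """Toy rule: assign above one threshold, suggest above a lower one."""
    if confidence >= assign_threshold:
        return "assigned"
    if confidence >= suggest_threshold:
        return "suggested"
    return "ignored"


# Hypothetical confidence scores returned by a custom model.
scores = [0.12, 0.31, 0.07]

# With relatively high (hypothetical) thresholds, nothing qualifies.
with_high_thresholds = [classify(s, 0.75, 0.5) for s in scores]

# With lowered (hypothetical) thresholds, some terms are assigned or suggested.
with_lowered_thresholds = [classify(s, 0.3, 0.1) for s in scores]

print(with_high_thresholds)
print(with_lowered_thresholds)
```

Under these assumed numbers, the small scores only produce assignments or suggestions once the thresholds are lowered, which is the situation the paragraph above describes.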

Add test data assets and run Metadata Enrichment

Add data assets to this project whose table names and column names are similar to those of the training data assets used in Part I.

Create a Metadata Enrichment asset

  1. In the project, click New asset, then select the Metadata Enrichment tool
  2. On the “Define details” page, enter a name, such as “Verify custom model”, then click Next
  3. On the “Data Scope” page, select all data assets in the project
  4. On the “Enrichment objective” page, ensure “Assign terms” is checked, and also check the “Profile data” tile. Keep this page open
  5. Click “Select categories” and select all categories that contain your business terms, such as “[uncategorized]”, “Industry Accelerators” (if present), etc. Please note that the selected categories must include all business terms that you expect to be considered. Click Next
  6. Click Next, then click Create.

The newly created Metadata Enrichment asset creates a job under the hood and starts it automatically. Once the job completes, a notification appears in the upper-right corner.

Based on the Metadata Enrichment configuration, only the custom model is used to assign terms. Check whether business terms are assigned to columns in Metadata Enrichment.

🎉Congratulations! You have successfully trained and deployed your custom model and used it to assign business terms automatically with Metadata Enrichment!🎉

Conclusion and Summary

  1. In Cloud Pak for Data on-premises, configure Metadata Enrichment to use a custom service (model)
  2. Run Metadata Enrichment with the custom service to assign terms and verify the custom model

Find more information at
