Run semantic expansion in IBM Knowledge Catalog

Rakhi Arora
11 min read · Jun 25, 2024

The IBM Knowledge Catalog (IKC) service now provides semantic and AI-augmented data enrichment. It can:

  • Recommend descriptive names for data assets and columns, based on the collected metadata and a predefined glossary.
  • Suggest and assign easy-to-understand semantic descriptions for data assets and columns. The descriptions are generated from the surrounding columns and the context of the data asset.
  • Generate semantic term assignments for data assets and columns.

In this blog, we will learn how to run semantic expansion on a sample data asset.

Step 1 : Log in to IBM Cloud Pak for Data

For this tutorial, we use an IKC instance on IBM Cloud Pak for Data.

1. Add an IBM Knowledge Catalog Lite instance to your account:
- Log in to IBM Cloud.
- Go to Navigation Menu → Resource list → Create resource → Catalog and search for `IBM Knowledge Catalog`.
- Select a `Location`, select the `Lite` instance, and click Create.

2. Log in to IBM Cloud Pak for Data.

Step 2 : Import categories in IKC

From the Navigation bar, go to Governance → Categories → New category → Import from file and select MyBankCategories.csv. A sample CSV is shown below.

Name,Artifact Type,Category,Description
My Bank,category,,Glossary for My Bank
Customer Details,category,My Bank,Category for individual customers
Address Information,category,My Bank>>Customer Details,Location related glossary for My Bank's customer
Insurance,category,My Bank,Glossary for My Bank Insurance Line of Business
Mortgage,category,My Bank,Glossary for My Bank Mortgage Line of Business
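
Before importing, it can help to confirm that every parent path in the `Category` column refers to a category that the file itself defines, because nested categories are written as `Parent>>Child`. The following is a minimal sketch using only the Python standard library; the function names `category_paths` and `missing_parents` are made up for this illustration and are not part of IKC.

```python
import csv
import io

# Rows copied from the sample MyBankCategories.csv above.
CATEGORIES_CSV = """\
Name,Artifact Type,Category,Description
My Bank,category,,Glossary for My Bank
Customer Details,category,My Bank,Category for individual customers
Address Information,category,My Bank>>Customer Details,Location related glossary for My Bank's customer
Insurance,category,My Bank,Glossary for My Bank Insurance Line of Business
Mortgage,category,My Bank,Glossary for My Bank Mortgage Line of Business
"""


def category_paths(csv_text):
    """Return the full '>>'-joined path of every category row."""
    paths = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        parent = row["Category"].strip()
        name = row["Name"].strip()
        paths.append(f"{parent}>>{name}" if parent else name)
    return paths


def missing_parents(csv_text):
    """Return parent paths that are referenced but never defined in the file."""
    defined = set(category_paths(csv_text))
    missing = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        parent = row["Category"].strip()
        if parent and parent not in defined:
            missing.append(parent)
    return missing


if __name__ == "__main__":
    for path in category_paths(CATEGORIES_CSV):
        print(path)
    print("missing parents:", missing_parents(CATEGORIES_CSV))
```

An empty result from `missing_parents` means the hierarchy in the file is self-consistent; it does not guarantee that the import itself will succeed.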

Step 3 : Import business terms for the My Bank category

From the Navigation bar, go to Governance → Business terms → Import from file and select MyBankBizTerms.csv. A sample CSV is shown below.

Name,Artifact Type,Category,Description
Actual Number Of Customers,glossary_term,My Bank,Number Of Customers
Auto Finance,glossary_term,My Bank,Automobile Finance
Credit Card,glossary_term,My Bank,A credit card is part of a system of payments named after the small plastic card issued to users of the system.
Credit Risk,glossary_term,My Bank,The possibility that either one of the parties to a contract will not be able to satisfy its financial obligation under that contract.
Credit Risk Score,glossary_term,My Bank,Measure of the relative credit risk
Customer Attrition Analysis,glossary_term,My Bank,
Division,glossary_term,My Bank,JK Life and JK Wealth are considered divisions of JKLW
Earned Premium,glossary_term,My Bank,That portion of a policy’s premium that applies to the expired portion of the policy.
Home Equity,glossary_term,My Bank,Home equity is the value of a homeowner's unencumbered interest in their property.
Net Written Premium,glossary_term,My Bank,Written premium less deductions for commissions and ceded reinsurance.
Reinsurance Ceded,glossary_term,My Bank,"That portion of a risk that an original insurer (also known as a ""primary"" insurer) transfers to a reinsurer in return for a stated premium."
Socio Economic Category,glossary_term,My Bank,
Source System Name,glossary_term,My Bank,Name of the source system
Unearned Premium,glossary_term,My Bank,"That portion of the policy premium that has not yet been ""earned""."
Unique ID On Source System,glossary_term,My Bank,Unique identifier that was used in the original source system applying to the insurance LOB
Written Premium,glossary_term,My Bank,The premium registered on the books
Age,glossary_term,My Bank>>Customer Details,
Age Group,glossary_term,My Bank>>Customer Details,
Country Of Birth,glossary_term,My Bank>>Customer Details,
Country Of Citizenship,glossary_term,My Bank>>Customer Details,
Customer ID,glossary_term,My Bank>>Customer Details,Unique Identifier for a Life Insurance customer
Customers,glossary_term,My Bank>>Customer Details,Customer information
Date of Birth,glossary_term,My Bank>>Customer Details,Date of Birth of the Insurance customer
Effective Customer Date,glossary_term,My Bank>>Customer Details,
Email Address,glossary_term,My Bank>>Customer Details,Customer Email Address
Employer ID,glossary_term,My Bank>>Customer Details,
Emp Status Type,glossary_term,My Bank>>Customer Details,
First Name,glossary_term,My Bank>>Customer Details,
Gender,glossary_term,My Bank>>Customer Details,"Customer's gender, if known"
Health Status Type,glossary_term,My Bank>>Customer Details,
Income Segment,glossary_term,My Bank>>Customer Details,
Last Name,glossary_term,My Bank>>Customer Details,Customer's surname
Lifecycle Type,glossary_term,My Bank>>Customer Details,
Life Insurance,glossary_term,My Bank>>Customer Details,Life Insurance Customer
Market Segment,glossary_term,My Bank>>Customer Details,Customer Market Segment
Middle Name,glossary_term,My Bank>>Customer Details,Customer's middle name or initial
Performance Status,glossary_term,My Bank>>Customer Details,Customer Performance Status
Relation Age Segment,glossary_term,My Bank>>Customer Details,
Summary,glossary_term,My Bank>>Customer Details,Summary information about a JKLW insured customer
Tax ID,glossary_term,My Bank>>Customer Details,Tax ID used for reporting income
Year End Process,glossary_term,My Bank>>Customer Details,Year End Process
Address part 1,glossary_term,My Bank>>Customer Details>>Address Information,
Address part 2,glossary_term,My Bank>>Customer Details>>Address Information,
Continuity Of Address Segment,glossary_term,My Bank>>Customer Details>>Address Information,
Country Of Residence,glossary_term,My Bank>>Customer Details>>Address Information,
Customer City,glossary_term,My Bank>>Customer Details>>Address Information,Customer City
Customer House Label,glossary_term,My Bank>>Customer Details>>Address Information,House number with optional suffix
Customer State,glossary_term,My Bank>>Customer Details>>Address Information,Current state of residence for a customer
Customer Street Name,glossary_term,My Bank>>Customer Details>>Address Information,Current street name for customer's address
Customer Street Suffix,glossary_term,My Bank>>Customer Details>>Address Information,Current suffix for street for customer's address
Customer Zipcode,glossary_term,My Bank>>Customer Details>>Address Information,Current zip code for customer's address
Amount insured,glossary_term,My Bank>>Insurance,The insured value. Normally represents the amount of the guarantee
Policy Number,glossary_term,My Bank>>Insurance,Unique identifier for a JK Insurance Policy
Policy Premium,glossary_term,My Bank>>Insurance,Payment for insurance
Amortising loan,glossary_term,My Bank>>Mortgage,The formal term for a standard principal and interest loan.
Arrears,glossary_term,My Bank>>Mortgage,Being overdue in repayments.
Asset,glossary_term,My Bank>>Mortgage,An item owned with a monetary value (eg. cash and/or property).
Bona-fide,glossary_term,My Bank>>Mortgage,Genuine and above board.
Break cost,glossary_term,My Bank>>Mortgage,Fees charged by the lender if the loan is paid-off in full before the end of the loan term.
Bridging finance,glossary_term,My Bank>>Mortgage,A temporary loan used as a gap measure between buying your new home and selling the old one.
Budget,glossary_term,My Bank>>Mortgage,A detailed review of your income and expenses.
Capital gains tax,glossary_term,My Bank>>Mortgage,Tax payable on the profit made when selling an investment property.
Cash advance,glossary_term,My Bank>>Mortgage,"A loan on a personal line of credit, typically a credit card attracting higher-than-normal interest."
Certificate of title,glossary_term,My Bank>>Mortgage,Document showing who owns the property as well as all the associated details of size and whether there is a mortgage registered on the title.
Comparison rate,glossary_term,My Bank>>Mortgage,A rate which includes fees and charges so loans can be compared on an equal basis (eg. a loan with a low advertised rate but high fees might cost the same as a loan with a higher advertised rate but low fees).
Contract variation,glossary_term,My Bank>>Mortgage,Any variation or alteration to the terms of a contract.
Conveyancing,glossary_term,My Bank>>Mortgage,Legal work carried out by your legal representative to transfer ownership of a property.
Creditor,glossary_term,My Bank>>Mortgage,A person or organisation who loans money on the expectation it is to be repaid.
Credit,glossary_term,My Bank>>Mortgage,"An agreement whereby the borrower receives goods or money now, on the understanding it is to be repaid under set guidelines that commonly include an interest charge."
Credit report,glossary_term,My Bank>>Mortgage,"A report outlining an individual’s credit history, public records and any credit black spots."
Daily interest,glossary_term,My Bank>>Mortgage,Interest calculated on a daily basis. Most variable rate loans calculate interest on a daily basis.
Debit card,glossary_term,My Bank>>Mortgage,A bank access card used to make withdrawals from current funds in a bank account.
Debt,glossary_term,My Bank>>Mortgage,An amount of money owed by one person or organisation to another.
Debt consolidation,glossary_term,My Bank>>Mortgage,To combine one or more debts previously held separately into one merged amount.
Debt Servicing Ratio (DSR),glossary_term,My Bank>>Mortgage,"The Debt Servicing Ratio measures whether you can afford the mortgage payments. To calculate the DSR, the lender uses a number of factors to work out the amount of your income that is available to repay the debt."
Default,glossary_term,My Bank>>Mortgage,Failure to make a loan repayment by a specified date.
Deferred payment,glossary_term,My Bank>>Mortgage,An agreement between two parties where the amount due to be paid on a given date may be postponed until a later date.
Deposit,glossary_term,My Bank>>Mortgage,Amount given in advance to show intention to purchase a property.
Deposit bond,glossary_term,My Bank>>Mortgage,An insurance policy to cover the deposit on a property being purchased.
Depreciation,glossary_term,My Bank>>Mortgage,"The amount claimed on an investment property for the reduction in the value of an item due to usage, passage of time, wear and tear."
Equity,glossary_term,My Bank>>Mortgage,The difference between your mortgage and your property’s value.
Fixed interest,glossary_term,My Bank>>Mortgage,"Your interest rate is locked in for a fixed term, you are then protected against possible interest rates rises for the selected ‘fixed’ term period."
Gearing,glossary_term,My Bank>>Mortgage,Investment property is negatively geared when expenses exceed rental income. Investment property is positively geared when the rental income received is greater than the total amount of the expenses.
Hardship variation,glossary_term,My Bank>>Mortgage,It may be possible to vary the terms of your contract should you find yourself in a position where you are having difficulty meeting your repayment obligations.
Lender,glossary_term,My Bank>>Mortgage,A person or organisation who provides money to another under the proviso that it will be repaid according to set guidelines and terms.
Lender’s Mortgage Insurance (LMI),glossary_term,My Bank>>Mortgage,Lender’s Mortgage Insurance is a once off insurance premium that protects the lender in the event you default on your mortgage repayments.
Liquid assets,glossary_term,My Bank>>Mortgage,"Are assets, either in cash or easily convertible to cash."
Loan to Value Ratio (LVR),glossary_term,My Bank>>Mortgage,"The value of the loan divided by the value of the property that the loan is for (eg. if you buy a $500,000 property and need a $350,000 loan – your LVR is 70%)."
Mortgage,glossary_term,My Bank>>Mortgage,"A loan for the purpose of purchasing a property, where the property is used as security."
Mortgage foreclosure,glossary_term,My Bank>>Mortgage,Where the lender forces the sale of the property held under the deed of mortgage in order to recoup unpaid monies owed under the terms of the agreement.
Mortgagee,glossary_term,My Bank>>Mortgage,The lending institution.
Mortgagor,glossary_term,My Bank>>Mortgage,The borrower (you).
National Consumer Credit Protection,glossary_term,My Bank>>Mortgage,Australian legislation covering consumer protection and consumer rights.
Non-conforming loans,glossary_term,My Bank>>Mortgage,Designed for those who find it more difficult to meet the borrowing conditions of standard loans.
Offset account,glossary_term,My Bank>>Mortgage,An offset account is an account linked to your mortgage. The balance in the account ‘offsets’ the principal of the loan. Overall interest is calculated on the principal less the offset account balance.
Ombudsman,glossary_term,My Bank>>Mortgage,Independent body established within a particular industry to investigate and resolve disputes as an outside party to the dispute.
Principal,glossary_term,My Bank>>Mortgage,The amount of capital borrowed.
Refinance,glossary_term,My Bank>>Mortgage,"Switching your loan from one product (or lender) to another, usually with a better interest rate or conditions. Your initial loan is paid out and your debt is transferred across to the new product or lender."
Repossess,glossary_term,My Bank>>Mortgage,To reclaim possession of goods or assets for failure to make payments within agreed terms.
Secured loan,glossary_term,My Bank>>Mortgage,"In this type of loan, the property being purchased is held as security against the loan."
Settlement,glossary_term,My Bank>>Mortgage,The day on which the process of changing title of a property occurs. Your legal representative will organise for the exchange of money and documents so that you become the legal owner of the property.
Unsecured loan,glossary_term,My Bank>>Mortgage,"A loan in which no property is held as security, generally attracting a higher rate of interest due to increased risk on the part of the lender."
Valuation,glossary_term,My Bank>>Mortgage,An estimation of the value of the property prepared by an independent professional valuer.
Variable interest rate,glossary_term,My Bank>>Mortgage,"The interest rate will vary depending on several factors, including the Reserve Bank’s current cash rate, and prevailing lender sentiment."
Vendor,glossary_term,My Bank>>Mortgage,The person who is selling the property.
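
Many of the terms above have an empty description, and every term must point at a category path created in Step 2. As a rough pre-import check, the sketch below lists terms without a description and terms whose category is unknown. It runs on a short excerpt of the file; the helper names are hypothetical, and the check is an assumption about what is useful to verify, not something IKC requires.

```python
import csv
import io

# Category paths defined by MyBankCategories.csv in Step 2.
KNOWN_CATEGORIES = {
    "My Bank",
    "My Bank>>Customer Details",
    "My Bank>>Customer Details>>Address Information",
    "My Bank>>Insurance",
    "My Bank>>Mortgage",
}

# A short excerpt of MyBankBizTerms.csv; the real file has many more rows.
TERMS_CSV = '''\
Name,Artifact Type,Category,Description
Credit Risk Score,glossary_term,My Bank,Measure of the relative credit risk
Socio Economic Category,glossary_term,My Bank,
Age,glossary_term,My Bank>>Customer Details,
Gender,glossary_term,My Bank>>Customer Details,"Customer's gender, if known"
Customer City,glossary_term,My Bank>>Customer Details>>Address Information,Customer City
Policy Number,glossary_term,My Bank>>Insurance,Unique identifier for a JK Insurance Policy
'''


def read_terms(csv_text):
    """Parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def terms_without_description(rows):
    """Return the names of terms whose Description column is empty."""
    return [r["Name"] for r in rows if not (r["Description"] or "").strip()]


def terms_in_unknown_categories(rows, known):
    """Return the names of terms whose Category path is not in `known`."""
    return [r["Name"] for r in rows if r["Category"].strip() not in known]


if __name__ == "__main__":
    rows = read_terms(TERMS_CSV)
    print("no description:", terms_without_description(rows))
    print("unknown category:", terms_in_unknown_categories(rows, KNOWN_CATEGORIES))
```

The `csv` module handles the quoted descriptions that contain commas, such as the one for Gender, so the columns stay aligned.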

Step 4 : Publish the business terms from the workflow tasks

From the Navigation bar, go to Governance → Task inbox and select the task Publish Business terms.

After publishing, you should be able to view these business terms.

Step 5 : Create a Project

  1. Go to Navigation Menu → Projects → View all projects.
  2. Click New project and provide a name, for example SALProject.

Step 6 : Create a Metadata Import (MDI) job to import assets to the project

  1. Go to the newly created project. Under the Assets tab, click New assets, search for Connect to a data source, and then select `IBM Db2 on Cloud`.
  2. Fill in the required fields for a sample database that contains a bank schema:

Db2: Db2Connection
Database: bludb
Hostname: 2046b15d-11b0-4572-a3e1-e36c746b050b.c8l9ggsd0kmvoig3l8kg.databases.appdomain.cloud
Port: 30695
Username: 323ecbd6
Password: 2jH7FnlR86F6xmBO
SSL enabled

3. Click New assets again and select Import metadata for data assets.
4. Provide a name and select the created project as the target.
5. Select the scope: click Db2Connection → BANKDEMO and select all the tables.
6. Click Next, then Create, to create an MDI job that adds the assets to the project. Keep the defaults and run the job.
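
If you want to test the connection details from step 2 with a JDBC client outside IKC, they can be combined into a Db2 JDBC URL. This is a minimal sketch, assuming the standard IBM Data Server Driver URL form `jdbc:db2://host:port/database:property=value;`; the helper names are made up for this example, and the host below is a placeholder rather than the real one. It also accepts a hostname that already carries a `:port` suffix.

```python
def split_host_port(hostname, default_port):
    """Split 'host:port' into (host, port); use default_port if no port is present."""
    host, sep, port = hostname.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return hostname, default_port


def db2_jdbc_url(hostname, port, database, ssl=True):
    """Build a Db2 JDBC URL from the fields used in the connection form."""
    host, port = split_host_port(hostname, port)
    url = f"jdbc:db2://{host}:{port}/{database}"
    if ssl:
        # sslConnection is a driver property; properties follow the database
        # name after a colon and each one ends with a semicolon.
        url += ":sslConnection=true;"
    return url


if __name__ == "__main__":
    # Placeholder host; substitute the hostname from the connection details.
    print(db2_jdbc_url("example.databases.appdomain.cloud", 30695, "bludb"))
```

The username and password are deliberately left out of the URL; most JDBC clients take them as separate arguments.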

Step 7 : Set the thresholds for display names and AI-generated descriptions

  1. Go to the project and open Manage → Metadata enrichment in the left pane.
  2. Turn on the toggles for `Display names` and `AI generated description` and set their thresholds. Here, we have set both to a 100% match.
  3. On the same page, turn on `Semantic term assignment` under the term assignment methods.

Step 8 : Start a Metadata enrichment (MDE) job to view semantic enrichment
1. Create an MDE job to enrich the previously imported tables `address` and `staff`:
- Go to Assets → New assets → Enrich data assets with metadata.
- Provide a name and click Next.
- Select data from project → Data asset → select the assets you imported from the database → Next.
- Select the `Expand metadata` and `Assign terms` objectives.
- Click Select categories → select My Bank → Next.
- Provide a job name → select All data assets → click Next.
- Create the metadata enrichment.

Step 9 : View the semantic enrichment options (Display name and Description)

The “Display name” and “Description” columns are now visible. Some suggestions are assigned automatically. An AI icon is shown next to a suggestion that has not been accepted yet.

The next section shows how semantic expansion works even for cryptic asset and column names.

Step 1 : Create a category in IKC, for example HRCategory

Step 2 : Create some business terms

For the category, create some business terms such as Business Process Modeling Language, Customer Relationship Management, Business Object Repository, and Business Application Programming Interface.

Step 3 : Create a project and import assets

Import the assets by dragging and dropping sample xlsx files that contain the following columns.

anul cptr tms txn prn idy qtr
1 2 3 4 parent identifier quarter
BPML CRM BOR BAPI
1 2 3 4
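
The column names in the second sample (BPML, CRM, BOR, BAPI) are the initials of the business terms created in Step 2. To make that relationship concrete, here is a naive acronym lookup in Python. It is only an illustration of why these names are "cryptic but resolvable"; it is not how IKC performs semantic expansion, and the function names are hypothetical.

```python
# Business terms from Step 2 of this section.
TERMS = [
    "Business Process Modeling Language",
    "Customer Relationship Management",
    "Business Object Repository",
    "Business Application Programming Interface",
]


def acronym(term):
    """Build an acronym from the first letter of each word in a term."""
    return "".join(word[0].upper() for word in term.split())


def match_columns(columns, terms):
    """Map each column name to the term whose acronym equals it, or None."""
    by_acronym = {acronym(term): term for term in terms}
    return {column: by_acronym.get(column.upper()) for column in columns}


if __name__ == "__main__":
    for column, term in match_columns(["BPML", "CRM", "BOR", "BAPI", "qtr"], TERMS).items():
        print(f"{column} -> {term}")
```

A lookup like this finds nothing for abbreviations such as `qtr` or `idy` in the first sample, which are not acronyms of any term; those are the cases where the AI-generated display names and descriptions are most useful.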

Step 4 : Open the assets and run profiling on them from the profile tab

Step 5 : Run a Metadata enrichment job

Run an MDE job and choose “Expand metadata” and “Assign terms” as the objectives. Select the categories and run the job.

Step 6 : View the display name and generated description

Once the job completes successfully, the assets and columns have display names and descriptions assigned.

Rakhi Arora

Semantic Automation, Data Privacy - IBM Knowledge Catalog, Cloud Pak for Data