Get rid of your digital clutter with watsonx.ai and CLI for GenAI

Rafał Bigaj
6 min read · May 10, 2024


Organize your documents with IBM watsonx.ai and CLI for GenAI

Written by: Rafał Bigaj, Jacek Midura, Rafał Maciasz

Photo by Wesley Tingey on Unsplash

Join me on an exploration of how to use IBM tools for generative AI to get your digital life in order.

Clara and her sad document story

Here is a very short story told to me by the new Granite 13 Billion Chat (granite-13b-chat-v2) large language model (LLM):

Once upon a time, in the bustling city of New York, lived a woman named Clara. Clara was a meticulous person, always organized and prepared for every situation. However, one day, her life took a turn for the worse when she found herself drowning in a sea of paperwork. Clara’s office was a mess, filled with stacks of papers, bills, and documents of all shapes and sizes. It was as if the weight of the world was on her shoulders, and she couldn’t find the time to sort through it all. The clutter had taken over her life, and she felt overwhelmed and stressed.

One day, while looking at her ever-growing pile of paperwork, Clara realized that she needed to take action. She began by categorizing the documents into different folders, separating them into important, semi-important, and unimportant categories. This process took time, but it was necessary to regain control of her life…

GenAI as the hero of the story

If you see yourself in the sad tale of Clara and her sea of paperwork, take heart that the same LLM that can generate a happy ending for the fictional tale of Clara can also deliver a happy ending for your organizational struggles in real life. Let’s explore how to use an LLM with the command line interface (CLI) for watsonx.ai to organize documents and address your real-world clutter.

One of the better-known techniques for managing digital documents is Named Entity Recognition (NER), a natural-language processing method used on unstructured text to locate, extract, and classify named entities into pre-defined categories such as person names, organizations, and locations. You can use this technique to automatically categorize a collection of documents. For example, imagine hundreds of legal contracts, all written in formal, legal language, that you were probably saving for some light reading (after you finish your favorite software manual!). Let’s step through how we can prompt an LLM to help us manage our dense collection.

As you know, the primary way to interact with an LLM is to create a prompt, in text form, where you express what kind of answer you want the model to construct. In this example, I want to design a prompt that will tackle my collection of contracts and categorize them by date, service, and organization.

Let’s start with the Prompt Lab in IBM watsonx.ai, where you can experiment manually and build the initial version of a prompt using the structured view.

The first version of the prompt might look like this:

Extract all listed fields using information from CONTRACT provided below.
Return the result in JSON format including only listed fields.
Fields:
Service
Organization
Address
Date
Client

You can try it out on the sample contract from:
https://raw.githubusercontent.com/IBM/cpdctl/master/samples/watsonx.ai/hvac-services-contract.txt

The expected response from the Granite 13 Billion Chat (granite-13b-chat-v2) model is:

{
"Service": "HVAC services",
"Organization": "New York HVAC Services",
"Address": "303 Fremont Dr., Brooklyn, New York, 11203",
"Date": "October 24, 2023",
"Client": "International Business Machines Corporation (IBM)"
}
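Once the entities come back as JSON, a few lines of scripting can already put them to work. As a minimal sketch (the file-naming scheme is my own illustration, not part of watsonx.ai), this Python snippet parses the response and builds a tidy file name from the extracted fields:

```python
import json
import re

def filename_from_entities(response_text: str) -> str:
    """Build a tidy file name from the JSON entities returned by the model."""
    entities = json.loads(response_text)
    # Compose "<Date> - <Organization> - <Service>" and strip characters
    # that are unsafe in file names.
    stem = " - ".join(entities[k] for k in ("Date", "Organization", "Service"))
    return re.sub(r'[\\/:*?"<>|]', "", stem) + ".txt"

response = """{
  "Service": "HVAC services",
  "Organization": "New York HVAC Services",
  "Address": "303 Fremont Dr., Brooklyn, New York, 11203",
  "Date": "October 24, 2023",
  "Client": "International Business Machines Corporation (IBM)"
}"""
print(filename_from_entities(response))
# October 24, 2023 - New York HVAC Services - HVAC services.txt
```

A rename based on names like these is often all it takes to make a folder of contracts searchable.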

That’s progress! We can already see how to identify the key pieces of text to extract and classify. Next, let’s repeat the same process using the IBM Cloud Pak for Data Command Line Interface (IBM cpdctl). If you want to try this yourself, download the CLI from https://github.com/IBM/cpdctl.

Next stop: prompting using the CLI

After you install the CLI and configure it to work with your IBM Cloud account, run this command to turn on the feature for experimenting with watsonx.ai:

export CPDCTL_ENABLE_WATSONX=true

To enter the prompt, put the prompt text and the expected field names in a variable:

PROMPT_TEXT="Extract all listed fields using information from CONTRACT provided below.\nReturn the result in JSON format including only listed fields.\nFields:\nService\nOrganization\nAddress\nDate\nClient\n\nCONTRACT:\n$(< hvac-services-contract.txt)\n\nOutput:"
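If you prefer to assemble the prompt in a script rather than a shell variable, here is a minimal Python sketch of the same prompt text, using real newline characters (the file name matches the sample contract above):

```python
from pathlib import Path

FIELDS = ["Service", "Organization", "Address", "Date", "Client"]

def build_prompt(contract_text: str) -> str:
    # Same instruction text as the PROMPT_TEXT shell variable above.
    return (
        "Extract all listed fields using information from CONTRACT provided below.\n"
        "Return the result in JSON format including only listed fields.\n"
        "Fields:\n" + "\n".join(FIELDS) + "\n\n"
        "CONTRACT:\n" + contract_text + "\n\nOutput:"
    )

# prompt = build_prompt(Path("hvac-services-contract.txt").read_text())
```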

Now you can use the “wx-ai text generate” command to get named entities returned in JSON format. The command accepts plenty of parameters:

  • ID of the large language model (e.g. ibm/granite-13b-chat-v2)
  • Parameters that control the model and response
  • Properties that control the moderations, for use cases such as ‘Hate and profanity’ (HAP) and ‘Personally identifiable information’ (PII) filtering

In this example, I’m using the greedy decoding method with a limit of 200 new tokens, which is sufficient for the few extracted fields. The parameters look like this:

{
"decoding_method": "greedy",
"max_new_tokens": 200,
"repetition_penalty": 1
}

Additionally, the moderation settings allow me to detect and remove any hate or profanity from the input and output. I need to extract personal information, such as an address or a telephone number, so I’m not specifying any options for PII filtering. The moderation settings are as follows:

{
  "hap": {
    "input": {
      "enabled": true,
      "threshold": 0.5,
      "mask": {
        "remove_entity_value": true
      }
    },
    "output": {
      "enabled": true,
      "threshold": 0.5,
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}

To run the final command, you have to provide the ID of your watsonx.ai project in place of “{project_id}”:

cpdctl wx-ai text generate --input "$PROMPT_TEXT" --project-id '{project_id}' --output json \
--model-id ibm/granite-13b-chat-v2 \
--parameters='{"decoding_method": "greedy", "max_new_tokens": 200, "repetition_penalty": 1}' \
--moderations='{"hap": {"input": {"enabled": true, "threshold": 0.5, "mask": {"remove_entity_value": true}}, "output": {"enabled": true, "threshold": 0.5, "mask": {"remove_entity_value": true}}}}'

This video explains how to create a new project: https://video.ibm.com/recorded/132861278

The output from the command is exactly the same as the output from the Prompt Lab, where I originally designed my prompt text:

{
"Service": "HVAC services",
"Organization": "New York HVAC Services",
"Address": "303 Fremont Dr., Brooklyn, New York, 11203",
"Date": "October 24, 2023",
"Client": "International Business Machines Corporation (IBM)"
}

This is a good example of named entities being identified and extracted successfully.
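If you want to run this step from a script rather than typing the command by hand, you can assemble the same invocation programmatically. This is only a sketch: it assumes cpdctl is installed and configured, and it simply rebuilds the argument list shown above:

```python
import json
import subprocess

PARAMETERS = {"decoding_method": "greedy", "max_new_tokens": 200, "repetition_penalty": 1}

def generate_args(prompt: str, project_id: str,
                  model_id: str = "ibm/granite-13b-chat-v2") -> list:
    """Assemble the cpdctl invocation shown above as an argument list."""
    return [
        "cpdctl", "wx-ai", "text", "generate",
        "--input", prompt,
        "--project-id", project_id,
        "--output", "json",
        "--model-id", model_id,
        "--parameters=" + json.dumps(PARAMETERS),
    ]

# Uncomment to run for real against your own watsonx.ai project:
# result = subprocess.run(generate_args(prompt_text, "your-project-id"),
#                         capture_output=True, text=True, check=True)
# print(result.stdout)
```

Passing the prompt as a single list element avoids any shell quoting issues with the embedded contract text.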

To recap, we now have two methods for identifying, extracting, and classifying named entities. The first might be the more efficient way to run an ad hoc task, while the CLI method provides a means to integrate the process into a recurring task. Let’s explore some possible next steps.

Building a reusable prompt

Now that you have confirmed that the prompt generates the correct output, you might say, “It would be great if I could save the part of the input that stays unchanged for all documents.” Okay, so let’s do that. With a few modifications, we can save the prompt as a prompt template for reuse. In the CLI, we use “wx-ai prompt create” to define the template, which we name extract-contract-details.

Replace “{space_id}” with the ID of a deployment space in the command below. For details on creating a deployment space, see the watsonx.ai documentation.

cpdctl wx-ai prompt create --name "extract-contract-details" \
--space-id '{space_id}' \
--prompt-model-id ibm/granite-13b-chat-v2 \
--prompt-data '{"instruction": "Extract all listed fields using information from CONTRACT provided below.\nReturn the result in JSON format including only listed fields.\nFields:\n{fields}\n\nCONTRACT:\n{contract}\n\nOutput:"}' \
--prompt-variables '{"fields": {"default_value": "Service\nOrganization\nAddress\nDate\nClient\n"}, "contract": {}}'

...
ID: f0f08bd4-1442-42f3-be39-a28a862324fa
Name: extract-contract-details

Let’s review the operation:

  • We are still prompting the Granite Base 13 Billion Chat model.
  • The prompt text remains the same, except that the identification and extraction fields are now structured as variables, allowing for re-use over a changing set of documents.
  • The output for the prompt template confirms the name of the template (extract-contract-details) and returns the ID of the persisted prompt.
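To make the variable mechanics concrete, here is a small Python illustration of how the {fields} and {contract} placeholders get filled in at inference time. This is my own sketch of the substitution, not the watsonx.ai implementation:

```python
# Default value for {fields}, as declared in --prompt-variables above.
DEFAULT_FIELDS = "Service\nOrganization\nAddress\nDate\nClient\n"

TEMPLATE = (
    "Extract all listed fields using information from CONTRACT provided below.\n"
    "Return the result in JSON format including only listed fields.\n"
    "Fields:\n{fields}\n\nCONTRACT:\n{contract}\n\nOutput:"
)

def render(contract: str, fields: str = DEFAULT_FIELDS) -> str:
    # {fields} falls back to its default; {contract} must always be supplied.
    return TEMPLATE.format(fields=fields, contract=contract)
```

Because {fields} is a variable, the same template can later extract a different set of entities from, say, invoices instead of contracts.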

Put the prompt template to work on clean-up detail

We are ready to make use of the prompt template and deploy it as a real-time inference endpoint. In watsonx.ai, you can identify any scoring endpoint by its deployment ID or by a given name, and the same is true for deployed prompts. The following command demonstrates how to deploy a prompt template; in this case, we set the serving name “extract_contract_details_granite_13b” to get a stable endpoint.

cpdctl wx-ai deployment create --prompt-template-id "f0f08bd4-1442-42f3-be39-a28a862324fa" \
--name "extract-contract-details-using-granite-13b" \
--space-id "{space_id}" \
--online-parameters '{"serving_name": "extract_contract_details_granite_13b"}' \
--base-model-id "ibm/granite-13b-chat-v2"

Now you can call the endpoint from a recurring task or an app to routinely review your files and sort them by the specified criteria, restoring organization and sanity to your digital life. It is now a snap to find the contract you want to pack for some light beach reading!
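As a sketch of such a recurring task, the loop below files each contract into a per-organization folder. The extract callable is a placeholder for the call to the deployed prompt endpoint; everything here apart from the extracted field name is my own illustration:

```python
import json
import shutil
from pathlib import Path

def sort_contracts(inbox: Path, outbox: Path, extract) -> None:
    """File every contract in `inbox` into a per-organization folder under `outbox`.

    `extract` stands in for the call to the deployed prompt endpoint; it takes
    the contract text and returns the model's JSON answer as a string.
    """
    for path in sorted(inbox.glob("*.txt")):
        entities = json.loads(extract(path.read_text()))
        target = outbox / entities["Organization"]
        target.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target / path.name))
```

Scheduled with cron or a similar tool, a script like this keeps the inbox folder empty and the archive sorted.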

Summary: Get organized with GenAI

We have demonstrated how you can use watsonx.ai and the IBM Cloud Pak for Data Command Line Interface (IBM cpdctl) to prompt a large language model to identify, extract, and classify documents according to criteria you specify. You can then save the prompt as a reusable prompt template and deploy the prompt template to get an endpoint. Use the endpoint to run your prompt as needed. For more information on how watsonx.ai delivers the tools you need to put generative AI to use, see https://www.ibm.com/watsonx.


Rafał Bigaj

System Architect with a long, successful record of building and leading teams, and broad, practical knowledge of cloud computing and machine learning.