watsonx.governance is now available

Published in IBM Data Science in Practice

Dec 5, 2023

Generative AI is dominating the headlines, and with it come questions about AI governance. The new and amplified risks of generative AI, along with new regulations, have left enterprises looking for solutions to these challenges. That is where watsonx.governance comes in.

watsonx.governance is the next step in our watsonx story. It brings trust and explainability, with new features to govern LLMs, and introduces one unified platform to govern both generative AI and predictive ML.

watsonx.governance Essentials Cloud Plan

Our first release of watsonx.governance is the launch of our Essentials Cloud Plan. The plan includes three key features:

AI Transparency with a model inventory and AI Factsheets capabilities
Monitoring and alerts with settable thresholds to provide auto-monitoring
The ability to govern both LLMs & predictive ML together

AI Transparency Features

Let’s discuss the new features to unlock AI transparency, specifically model inventory and AI Factsheets.

A model inventory is like a map for AI within your organization. With a model inventory, data scientists and prompt engineers can organize models by use case and see at a glance where each model is in the AI lifecycle. Within each use case, models are grouped by lifecycle stage. The model inventory also flags when an action is required, such as investigating a model metric that has gone beyond its threshold. This saves time for individuals and helps teams collaborate. By becoming the “home page” of the AI lifecycle for both LLMs and predictive ML, the model inventory supports a repeatable, reliable, and efficient AI process.

Model Inventory for an LLM Prompt in development, ready to be moved to Test stage

Complementing the model inventory to improve AI transparency, we are also introducing AI Factsheets for LLMs. AI Factsheets automatically document model metadata in an always up-to-date Factsheet that serves as a single source of truth. Unlike traditional model cards, AI Factsheets are structured and updated without manual effort. The latest prompts, metrics, model details, health scores and more are logged automatically. This saves time, addresses policy and regulatory requirements for model documentation, and provides an easy way to share details of an AI model with any stakeholder in just a click.

Example of an AI Factsheet for an LLM Prompt

Together, the model inventory and AI Factsheets organize the AI development process and save teams time, while meeting key documentation requirements for policies and regulations.

Monitoring Quality and Safety Metrics

The watsonx.governance Essentials cloud plan also offers evaluation and monitoring of LLMs. We are launching with the ability to evaluate and monitor LLM quality and safety metrics. We’re introducing a variety of quality metrics for text summarization, classification, content generation, Q&A, and entity extraction use cases. We are also offering the ability to monitor LLM drift, which is essential for trusting an LLM once it is in production. And we are introducing safety monitors for toxic language and personally identifiable information in both the input and output of prompts. These scores can be calculated during development or while a model is in production. They can also be calculated on demand or automatically on a schedule. In either case, AI Factsheets and the model inventory are automatically updated with the latest scores. These features will improve prompt quality during development and ensure models do not need babysitting, as data scientists are automatically alerted when any threshold is breached.

Evaluating quality metrics using watsonx.governance for an LLM Prompt
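
To make the threshold-based alerting described above concrete, here is a minimal Python sketch. It is not the watsonx.governance API: the `Threshold` class, the `breached` function, the metric names, and all numeric values are hypothetical and chosen only for illustration. It assumes each metric has one limit and a direction (higher is better for quality, lower is better for safety rates).

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Alert rule for one metric (hypothetical structure, not a product API)."""
    metric: str
    limit: float
    higher_is_better: bool = True


def breached(thresholds, scores):
    """Return the names of metrics whose latest score violates its threshold."""
    alerts = []
    for t in thresholds:
        score = scores.get(t.metric)
        if score is None:
            continue  # metric was not evaluated in this run
        if t.higher_is_better and score < t.limit:
            alerts.append(t.metric)
        elif not t.higher_is_better and score > t.limit:
            alerts.append(t.metric)
    return alerts


# Illustrative thresholds and scores only; not real evaluation results.
thresholds = [
    Threshold("rouge_l", 0.40),                                 # quality
    Threshold("toxicity_rate", 0.05, higher_is_better=False),   # safety
    Threshold("pii_rate", 0.00, higher_is_better=False),        # safety
]
latest_scores = {"rouge_l": 0.35, "toxicity_rate": 0.01, "pii_rate": 0.02}

print(breached(thresholds, latest_scores))  # ['rouge_l', 'pii_rate']
```

The same check could run on demand or on a schedule; only the source of `latest_scores` would change.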

That’s not all: we are also introducing model health for LLMs, which tracks metrics such as data size, latency, throughput, number of records, and number of users to identify bottlenecks and compute-intensive workloads. It is a great tool for ensuring models have the right resources for their usage. Like the quality and safety metrics, these are automatically logged in the AI Factsheet and model inventory for AI transparency.

Model Health for an LLM Prompt (that is brand new)
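
As a rough illustration of the model health metrics named above, the sketch below summarizes a batch of request records. It is an assumption-laden example, not how watsonx.governance computes these values: the record layout `(user_id, latency_ms, payload_bytes)`, the `health_summary` function, and the sample data are all hypothetical.

```python
from statistics import mean


def health_summary(records, window_seconds):
    """Summarize request records of the form (user_id, latency_ms, payload_bytes).

    The record layout is a hypothetical one chosen for this example.
    """
    if not records:
        return {"records": 0, "users": 0, "avg_latency_ms": 0.0,
                "throughput_per_s": 0.0, "total_bytes": 0}
    return {
        "records": len(records),                                # number of scored requests
        "users": len({user for user, _, _ in records}),         # distinct users
        "avg_latency_ms": mean(lat for _, lat, _ in records),   # mean latency
        "throughput_per_s": len(records) / window_seconds,      # requests per second
        "total_bytes": sum(size for _, _, size in records),     # data size
    }


# Made-up sample: three requests from two users within a 60-second window.
sample = [("u1", 120.0, 2048), ("u2", 80.0, 1024), ("u1", 100.0, 1024)]
summary = health_summary(sample, window_seconds=60)
print(summary)
```

A rising average latency alongside flat throughput is the kind of pattern such a summary could surface as a bottleneck.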

The last major feature we are introducing for LLM governance is attribution for LLMs. As a reminder, the RAG Q&A use case allows prompt engineers to add context or reference data to improve prompt output. Attribution for LLMs adds the ability to measure how much the output relies on that reference data, like a citation score. With this feature, prompt engineers can understand how context data influences output. It is especially valuable when dozens or hundreds of examples of context or reference data are provided. It is a fantastic understanding and validation tool included in the watsonx.governance Essentials plan.
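
One simple way to picture a "citation score" is token overlap between the output and each reference passage. The sketch below uses that stand-in technique; it is not the attribution method used by watsonx.governance, which this post does not describe. The `tokens` and `attribution_scores` functions and the sample texts are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def attribution_scores(output, passages):
    """Fraction of distinct output tokens that also appear in each passage.

    A crude lexical proxy for attribution, used here only for illustration.
    """
    out = set(tokens(output))
    if not out:
        return [0.0 for _ in passages]
    return [len(out & set(tokens(p))) / len(out) for p in passages]


# Made-up output and reference passages.
output = "The warranty covers battery replacement for two years"
passages = [
    "Battery replacement is covered by the warranty for two years.",
    "Shipping takes five business days.",
]
scores = attribution_scores(output, passages)
print(scores)  # [0.875, 0.0]
```

With many passages, sorting by score gives a quick view of which reference data the output leans on most.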

Powerful Features for Predictive ML Governance

Lastly, we are offering a powerful set of capabilities for Predictive ML. This includes:

Local and Global Explanations
Fairness Monitoring
The ability to automatically de-bias unfair models
What-if Analysis
Drift monitoring, AI Factsheets, Model Inventory, Quality monitoring and more
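
To give a sense of what fairness monitoring, listed above, can measure, here is a sketch of the disparate impact ratio, a widely used fairness metric. Whether and how watsonx.governance computes this exact ratio is an assumption; the `disparate_impact` function, the group labels, and the data are hypothetical.

```python
def disparate_impact(outcomes, groups, monitored, reference):
    """Ratio of favorable-outcome rates: monitored group / reference group.

    outcomes: 1 for a favorable prediction, 0 otherwise.
    groups:   group label for each prediction, in the same order.
    """
    def rate(group):
        selected = [o for o, g in zip(outcomes, groups) if g == group]
        if not selected:
            raise ValueError(f"no records for group {group!r}")
        return sum(selected) / len(selected)

    reference_rate = rate(reference)
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return rate(monitored) / reference_rate


# Made-up predictions: group "a" is the reference, group "b" is monitored.
outcomes = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
ratio = disparate_impact(outcomes, groups, monitored="b", reference="a")
print(round(ratio, 3))  # 0.333
```

A ratio well below 1 means the monitored group receives favorable outcomes less often than the reference group, which is the kind of signal a fairness threshold would alert on.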

Pricing

We started with the idea of making pricing as simple as possible. We charge only for evaluations and explanations, priced per Resource Unit. We’re not charging per user, up-charging for LLM prompt governance, or even requiring a minimum monthly fee. You pay per evaluation or explanation. Super simple.

Try now

Everyone can get started with watsonx.governance today for free. A Lite plan includes access to watsonx.ai, so you can try prompt engineering with any of the included models, then use watsonx.governance to evaluate quality and safety metrics, create an AI Factsheet, and track progress in a model inventory. You can even put a model into production and monitor it for drift without babysitting. Try it now at https://www.ibm.com/products/watsonx-governance

#IBM #watsonx #AIGovernance #ExplainableAI #ResponsibleAI #AI #Transparency