Protecting confidential customer data with private variables in watsonx Assistant

Private variables prevent confidential customer data from being stored in conversation logs or returned to the client application to ensure privacy is always maintained

Eun Young Ha
IBM watsonx Assistant
8 min read · Jun 25, 2024


When virtual assistants are built, they often need to collect confidential data from customers, such as PII (e.g., a social security number) or authentication tokens for a web application. Storing sensitive customer or business data in conversation logs raises security concerns ranging from inadvertent exposure of that data, to malicious exploitation by third parties, to mishandling of private data that can lead to legal issues.

To address this, watsonx Assistant recently released a new feature called private variables, which offers the following capabilities to enterprises:

  • Privacy: The content of variables designated as private is not stored in watsonx Assistant conversation logs. Any reference to the private variable is masked in customer input and the virtual assistant’s response.
  • Security: The raw values of the data designated as private are not returned to the client application. In the case of those using the stateful message API the content is simply not returned, while those using the stateless message API will see an encrypted version of the data which cannot be decrypted by the client.
  • Flexibility: Assistant builders can decide which variables should be designated as “private”. Session variables and/or action variables may be designated as private. By default all variables are considered to be “public” (not private).
  • Serviceability: The raw value of a private variable may be viewed in the watsonx Assistant preview UI and only in that UI. Assistant builders can debug flows that use private variables with the same efficiency as flows with public variables.

Designating confidential data as private

In watsonx Assistant’s actions, variables are declared in one of three categories: application variables, session variables, and action variables. While application variables are created by the application itself and hold application-level information (e.g., current time, user’s timezone, etc.), session variables and action variables are created by the assistant builder and can be used to store the data collected from a customer.

Session variables persist for the duration of a customer session, which may involve more than one action. Action variables persist for the lifecycle of the action they belong to, so once an action ends, all action variables associated with that action are removed from the context of the conversation. Private variables can be applied to both session and action variables.

Consider a scenario where a conversation designer is authoring an action “Transfer balance” and creates a step that asks a customer for an email. The email provided by the customer is stored in an action variable associated with the step. The conversation designer can designate the customer’s email as private by going to the step’s Customer response settings and selecting the Protect data collected at this step option, which configures the action variable associated with the step as a private variable.

Designating customer data collected by a step as private

If the conversation designer decides to save the customer’s email to a session variable to make it available for other actions during the same customer session, the session variable should be explicitly configured as a private variable. Otherwise, the session variable will not be treated as a private variable even though its value came from a private action variable.

A session variable can be configured as a private variable by selecting the Protect data stored in this variable option
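The rule that privacy does not carry over with the value can be illustrated with a small model. This is only a sketch of the behavior described above, not watsonx Assistant code: the `Variable` class, the `assign` helper, and all variable names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variable:
    """Hypothetical model of an action or session variable."""
    name: str
    value: Optional[str] = None
    private: bool = False  # variables are public unless designated private


def assign(target: Variable, source: Variable) -> None:
    """Copy only the value; the target keeps its own privacy setting."""
    target.value = source.value


# Action variable filled by a step with data protection enabled.
step_email = Variable("step_email", "email@email.com", private=True)

# Session variable the builder did not protect: it stays public.
unprotected_email = Variable("customer_email")

# Session variable explicitly configured as private.
protected_email = Variable("customer_email_protected", private=True)

assign(unprotected_email, step_email)
assign(protected_email, step_email)

print(unprotected_email.private)  # False
print(protected_email.private)    # True
```

In this model both session variables hold the same email, but only the one that was explicitly marked private would receive the protections described in this article.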

Masking mentions of private data in a conversation

The interactions between customers and a virtual assistant are stored in watsonx Assistant’s conversation logs. Before logging the conversation, watsonx Assistant replaces all references to private data appearing in the customer input or in the virtual assistant’s response with ******.

Consider the following interaction between a customer and a virtual assistant, where the customer’s email is designated as private data.

Assistant: "To log in, can you provide your email address?"
Customer : "My email is email@email.com"
Assistant: "Thanks for providing your email email@email.com".

In the conversation logs, all mentions of “email@email.com” are masked as ****** both in the customer input and in the virtual assistant’s response as shown below.

In watsonx Assistant’s conversation logs, mentions of private data in a customer input or in the virtual assistant’s response are masked as ******.
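The masking behavior can be approximated with a few lines of Python. This is a minimal sketch that assumes mentions are found by exact substring match; the article does not describe how watsonx Assistant detects mentions internally, and the function name `mask_private_mentions` is hypothetical.

```python
from typing import List

MASK = "******"


def mask_private_mentions(text: str, private_values: List[str]) -> str:
    """Replace every exact occurrence of a private value with the mask."""
    # Handle longer values first so a value containing another is masked whole.
    for value in sorted(private_values, key=len, reverse=True):
        if value:  # skip empty strings, which would match everywhere
            text = text.replace(value, MASK)
    return text


private_values = ["email@email.com"]
customer_input = "My email is email@email.com"
assistant_output = "Thanks for providing your email email@email.com"

masked_input = mask_private_mentions(customer_input, private_values)
masked_output = mask_private_mentions(assistant_output, private_values)

print(masked_input)   # My email is ******
print(masked_output)  # Thanks for providing your email ******
```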

When a virtual assistant carries out conversations with a customer, it sends the customer input to watsonx Assistant’s message API endpoint and receives a response. If a customer input includes private data, watsonx Assistant adds a masked input to its response, where all mentions of the private data are replaced with ******. Likewise, if the virtual assistant’s response includes mentions of private data, the masked output is included in the endpoint’s response along with the normal (i.e., non-masked) output. Refer to the API docs for the exact specifications.

Hiding private data in the context of the conversation

When a customer starts a new session, watsonx Assistant creates a context of the conversation to store session-specific data that needs to be referred to across conversation turns. For example, session variables and action variables are stored in the conversation context. watsonx Assistant offers two types of message API endpoints depending on how the conversation context is handled: stateful message and stateless message.

For the stateful message, the context of the conversation is maintained by watsonx Assistant for the duration of the session. By default, the conversation context is not included in the stateful message endpoint’s response. However, a client application can choose to include it in the endpoint’s response by setting the return_context option to true when it sends an API request to the endpoint. In such a case, watsonx Assistant omits private session variables and private action variables from the conversation context returned in the response.

One exception is when the client application is watsonx Assistant’s actions preview pane. The actions preview pane uses the stateful message endpoint, but private variables are returned to it with their raw values. This is an intentional design choice to help builders validate and debug the action they are authoring.

Unlike the stateful message, the conversation context is not maintained inside watsonx Assistant for the stateless message. Instead, the conversation context is always included in the endpoint’s response. When private variables are present in the conversation context, watsonx Assistant encrypts their values in the response, so the raw values of the private variables are not returned to the client application.

The encryption and decryption mechanism is kept confidential inside watsonx Assistant. However, no contemporary encryption technology guarantees 100% protection, so the stateful message endpoint is the more secure means of protecting confidential customer data: it does not return private variables in the conversation context to the client application at all.

When watsonx Assistant logs a conversation, it also logs the conversation context, but it removes all private variables from the conversation context before logging the conversation, no matter whether the original message API request was made using the stateful message or the stateless message endpoint.
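The three treatments of the conversation context described in this section can be summarized in a small model. This is a conceptual sketch only: the function names, the flat dictionary context, and the `placeholder_encrypt` stand-in are all assumptions made for illustration. In particular, the stand-in performs no real encryption, and the actual mechanism used by watsonx Assistant is not public.

```python
from typing import Callable, Dict, Set

Context = Dict[str, str]


def stateful_view(context: Context, private_names: Set[str],
                  is_preview: bool = False) -> Context:
    """Context returned by the stateful endpoint when return_context is true."""
    if is_preview:
        return dict(context)  # the actions preview pane sees raw values
    return {k: v for k, v in context.items() if k not in private_names}


def stateless_view(context: Context, private_names: Set[str],
                   encrypt: Callable[[str], str]) -> Context:
    """Context returned by the stateless endpoint: private values are encrypted."""
    return {k: (encrypt(v) if k in private_names else v)
            for k, v in context.items()}


def logged_view(context: Context, private_names: Set[str]) -> Context:
    """Context written to conversation logs, regardless of endpoint."""
    return {k: v for k, v in context.items() if k not in private_names}


def placeholder_encrypt(value: str) -> str:
    """Stand-in only: NOT real encryption, just marks where ciphertext would go."""
    return "<encrypted>"


context = {"customer_name": "Jon", "customer_email": "email@email.com"}
private_names = {"customer_email"}

print(stateful_view(context, private_names))
print(stateful_view(context, private_names, is_preview=True))
print(stateless_view(context, private_names, placeholder_encrypt))
print(logged_view(context, private_names))
```

The stateful and logged views drop the private variable entirely, while the stateless view keeps the key and replaces the value, which is why the stateless client can round-trip the context without ever seeing the raw data.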

Private result variables for subactions and extensions

The result of subaction and extension calls may also be designated as private. When actions make a subaction or extension callout, the result of the callout is stored as an action variable. The assistant treats these variables in the same way as all other action variables. When they are designated as private, the result of the subaction or extension callout is given the same treatment as the private variables described in earlier sections.

Designating the result of a subaction callout as private
Designating the result of an extension callout as private

If the result of a subaction call is designated as private, watsonx Assistant considers the entire result private. Suppose a main action calls a subaction consisting of three steps that collect the customer’s name, email, and address, where the email and address are designated as private but the name is not. Once the subaction finishes and the result is returned to the main action, the customer’s name is also treated as private data within the main action, because the main action considers the entire result of the subaction call private.

During an extension callout, watsonx Assistant appends detailed metadata about the extension request and response to the message API response. This metadata is added to allow the conversation designer to better debug potential issues with the extension callout. This metadata is only returned to the client application when the client is the actions preview within watsonx Assistant. This metadata is never passed to other client applications or watsonx Assistant’s conversation logs.

Hiding private data in coexistence

It should be noted the private variables feature described so far is only available as a part of watsonx Assistant’s actions framework. If a virtual assistant employs both actions and dialog (a.k.a. coexistence) and an action is called out from inside the dialog, the private variables feature will not function as expected even if the confidential data is designated as private within the action.

The older dialog framework offers its own mechanism for hiding confidential information from the context of the conversation, although its functionality is limited compared to what private variables in actions offer.

In dialog, the data collected from a customer is saved in context variables. A conversation designer can prevent confidential customer data from being stored in watsonx Assistant’s conversation logs by storing it in a context variable nested within the $private section of the conversation context, for example $private.action_result_1 as shown below.

Data stored in this way is not written to watsonx Assistant’s conversation logs. However, it is still returned to the client application with its raw values in the message API response, and mentions of the private data in the customer input or in the virtual assistant’s response are not masked in watsonx Assistant’s conversation logs.

Designating confidential data as a private context variable in Dialog

Conclusion

This article explained how confidential customer data can be protected in watsonx Assistant actions using the private variables feature. Once the conversation designer designates data as private, the confidential information collected from a customer is not stored in watsonx Assistant’s conversation logs, and its raw values are not returned to the client application.

Before concluding, it should be clarified that the private variables feature acts only on the data that the virtual assistant expects the customer to provide. If a customer enters confidential information that is not asked for by any step in a given action, that data is not protected. For instance, suppose a customer starts a conversation by saying “Hi, my name is Jon Doe and my social security number is 123–34–1234”, and this triggers an action that has no step collecting a social security number. The social security number is then not masked in the conversation logs, because the virtual assistant has no means of recognizing it as private data.

If assistant builders are concerned about such scenarios, they need to seek another solution, such as adding a pre-hook that scans the customer input for common patterns (e.g., account IDs, phone numbers, etc.).
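As a rough illustration of such a scan, the sketch below redacts a few common patterns from customer input before it reaches the assistant. Everything here is an assumption made for illustration: the function name `redact_common_patterns`, the pattern set, and the US-style number formats. How the function would be wired into a pre-hook is not shown, and real deployments would need patterns tuned to their own data.

```python
import re

MASK = "******"

# Illustrative patterns only; they cover a few US-style formats.
PATTERNS = {
    # 123-45-6789, with either a hyphen or an en dash as separator
    "ssn": re.compile(r"\b\d{3}[-\u2013]\d{2}[-\u2013]\d{4}\b"),
    # 555-123-4567, 555.123.4567, or 555 123 4567
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    # simple email shape: local part, @, dotted domain
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact_common_patterns(text: str) -> str:
    """Replace anything matching a known sensitive pattern with the mask."""
    for pattern in PATTERNS.values():
        text = pattern.sub(MASK, text)
    return text


message = "Hi, my name is Jon Doe and my social security number is 123-34-1234"
print(redact_common_patterns(message))
# Hi, my name is Jon Doe and my social security number is ******
```

Pattern matching of this kind only catches data with a recognizable shape. Free-form values such as the customer’s name in the example above pass through unchanged.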
