NinjaChat Basics with Dialogflow

Published in Ninja Van Tech, Nov 5, 2020

This article is part two in a series (see part one) about how we set up a chatbot, NinjaChat, to serve our shippers and consignees here at Ninja Van. This second installment continues the earlier discussion in greater detail. It covers the functional aspects of our bot and how we used Dialogflow’s rich set of features, such as intents, contexts and actions, to build it.

Using Intents & Contexts to Control Conversation Flow

An intent “categorizes an end-user’s intention for one conversation turn”. It describes where a user will end up in a conversation tree after sending one utterance to our bot. When a user sends an utterance to our bot, that utterance is matched against a list of intents (which are themselves trained on example utterances, or training phrases). The matched intent, along with its configured response and action, is returned from Dialogflow to SNS and subsequently processed.

A typical conversation tree has a specific combination of one or more states preferentially leading to another specific combination of one or more states. This preferential traversal from one state to another is achieved in Dialogflow via active contexts which define the current state of the conversation (in addition to which intent is getting matched). Intents are set up to respond to certain active contexts.

Dialogflow — A typical intent as seen from the Console

Each intent can have one or more of the following:

Input Contexts (top bar under the Contexts section): Control how likely an intent is to be matched. An intent is likely to be matched when all of its input contexts are currently active in the conversation.
Output Contexts (bottom bar): Describe the contexts that will be activated once the intent is matched.

With this, we can map out pretty much the entire conversation tree as a sequence of intents and pathways branching out from one or more starting intents.
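To make the gating idea concrete, here is a minimal sketch in plain Java. It is not Dialogflow code: the ContextGate class and the track-order intent are hypothetical, and it treats input contexts as a hard filter, whereas Dialogflow also weighs how closely the utterance matches the training phrases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ContextGate {
    /** Hypothetical stand-in for an intent: a name plus its input and output contexts. */
    static final class Intent {
        final String name;
        final Set<String> inputContexts;
        final Set<String> outputContexts;

        Intent(String name, Set<String> inputContexts, Set<String> outputContexts) {
            this.name = name;
            this.inputContexts = inputContexts;
            this.outputContexts = outputContexts;
        }
    }

    /** Names of the intents whose input contexts are all currently active. */
    static List<String> eligible(List<Intent> intents, Set<String> active) {
        List<String> names = new ArrayList<>();
        for (Intent intent : intents) {
            if (active.containsAll(intent.inputContexts)) {
                names.add(intent.name);
            }
        }
        return names;
    }

    /** Active contexts after a match: the previous ones plus the matched intent's outputs. */
    static Set<String> afterMatch(Intent matched, Set<String> active) {
        Set<String> next = new TreeSet<>(active);
        next.addAll(matched.outputContexts);
        return next;
    }
}
```

In this toy model, an intent that requires customer-main-menu only becomes eligible after some other intent has emitted that context, which is how one state in the tree leads to the next.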

One last thing: each context is configured with a lifespan, measured in the number of conversational turns it can remain active (turns = number of times intents are matched). This lifespan is refreshed whenever a matched intent sets the same context as one of its output contexts.

(Hint: Notice how the output context customer-sns has a lifespan of 50 whereas customer-main-menu has 0, indicating that it is deactivated once the intent is matched.)
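The lifespan rules can be sketched as a small bookkeeping class. This is a simplified model of the behavior described above, not Dialogflow's implementation: the ContextLifespans class and its method names are assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

class ContextLifespans {
    // Context name -> remaining conversational turns
    private final Map<String, Integer> active = new TreeMap<>();

    /** Ages every active context by one turn, then applies the matched intent's output contexts. */
    void onIntentMatched(Map<String, Integer> outputContexts) {
        Iterator<Map.Entry<String, Integer>> it = active.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            int remaining = entry.getValue() - 1;
            if (remaining <= 0) {
                it.remove();            // lifespan used up
            } else {
                entry.setValue(remaining);
            }
        }
        for (Map.Entry<String, Integer> out : outputContexts.entrySet()) {
            if (out.getValue() <= 0) {
                active.remove(out.getKey());              // lifespan 0 deactivates the context
            } else {
                active.put(out.getKey(), out.getValue()); // sets or refreshes the lifespan
            }
        }
    }

    boolean isActive(String name) {
        return active.containsKey(name);
    }

    int remainingTurns(String name) {
        return active.getOrDefault(name, 0);
    }
}
```

An intent that re-emits customer-sns with lifespan 50 on every match keeps that context alive indefinitely, while emitting customer-main-menu with lifespan 0 switches it off immediately.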

Some interesting ways we have used contexts in Dialogflow for our chatbot:

Each exclusive type of user (unverified user, consumer and shipper) has its own overarching context (e.g. shipper-sns), which is accepted as an input context and re-fired as an output context by all associated intents. This ensures that an utterance from a shipper can never match an intent meant for a consumer, and vice versa.
Certain flows require us to capture information from the user, validate that information and rely on that validated information for subsequent intent matches. One example is a flow for shippers to create appointments for their parcels to be picked up by our drivers. The sequence of intents prompts the user for their requested date, address, etc. After each matched intent, we validate the captured information with external services and, if successful, manually update the main user context in Dialogflow using one of Dialogflow’s ContextsClient methods, storing the value under a uniquely named parameter that marks it as ‘validated’. This ‘validated’ parameter on the overarching user context is later retrieved as needed. If it is not found when needed, an error response is shown to the user.
The lifespan of each context reflects how long it is needed: contexts that should be deactivated are set as an output context with lifespan = 0, contexts that will be required in the foreseeable future get lifespan = 5, and overarching user contexts get lifespan = 50, meaning we never want them to expire (although arguably some other big-enough number would get the job done as well).
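The capture-validate-store flow from the second point can be sketched with an in-memory map standing in for the parameters on the overarching user context. Everything here is hypothetical: the class name, the validated_ key prefix and the Predicate that stands in for our external validation services. In the real service the write goes through Dialogflow’s ContextsClient rather than a local map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

class ValidatedContextSketch {
    // In-memory stand-in for the parameters stored on the main user context
    private final Map<String, String> userContextParams = new HashMap<>();

    /** Hypothetical naming scheme that marks a parameter as validated. */
    static String validatedKey(String param) {
        return "validated_" + param;
    }

    /** Stores the captured value only if the (stand-in) external validation passes. */
    boolean captureIfValid(String param, String value, Predicate<String> validator) {
        if (!validator.test(value)) {
            return false;
        }
        userContextParams.put(validatedKey(param), value);
        return true;
    }

    /** Returns the validated value, or empty so the caller can show an error response. */
    Optional<String> validated(String param) {
        return Optional.ofNullable(userContextParams.get(validatedKey(param)));
    }
}
```

A later handler would call validated("pickup_date") and fall back to an error response when the Optional is empty.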

*As a special noteworthy mention:

Context management was an integral part of a feature we were working on called timelapse. Timelapse allows the user to select an option from way back in their conversation history and, as long as the selected option was shown during the current session, have the bot respond to it as if nothing else had occurred since then, with all appropriate contexts and stored values active and present. (Just like old times.)

To accomplish this, we relied on the intent match metadata that we persist in our database with each user-bot interaction. When an earlier option is selected, we scan through our history of interactions for this user and try to extract the metadata from the interaction just before the match.

Using this metadata, we call on Dialogflow’s ContextsClient to restore all active contexts contained within the metadata we extracted, including stored parameter values. Finally, we trigger an intent match with the newly selected option and process the result of that match.

This allows users to jump right back into the main menu if they lose their way in our chatbot, track another order immediately without having to navigate back to that part of the conversation, or even return to the middle of order pickup creation once they have successfully created a pickup with our bot.
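The lookup step of timelapse can be sketched as follows. The shape of a persisted interaction is an assumption (the options it displayed plus a snapshot of the contexts active at that point), and the sketch stops at returning the snapshot. Restoring it through ContextsClient and re-triggering the intent match are left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class TimelapseSketch {
    /** Hypothetical shape of one persisted user-bot interaction. */
    static final class Interaction {
        final List<String> optionsShown;
        final Map<String, Integer> activeContexts; // context name -> lifespan

        Interaction(List<String> optionsShown, Map<String, Integer> activeContexts) {
            this.optionsShown = optionsShown;
            this.activeContexts = activeContexts;
        }
    }

    /**
     * Scans the history from newest to oldest for the interaction that displayed
     * the selected option and returns a copy of its context snapshot.
     */
    static Optional<Map<String, Integer>> contextsToRestore(
            List<Interaction> history, String selectedOption) {
        for (int i = history.size() - 1; i >= 0; i--) {
            Interaction interaction = history.get(i);
            if (interaction.optionsShown.contains(selectedOption)) {
                return Optional.of(new HashMap<>(interaction.activeContexts));
            }
        }
        return Optional.empty(); // option was never shown in this session
    }
}
```

An empty result corresponds to an option that was not shown during the current session, in which case there is nothing to restore.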

Using Actions to Produce Side Effects

After an intent is matched, SNS needs to, in a quasi-idempotent way, do more than just reply to the user with a pre-configured response. It needs to call services, update databases, process information, etc. This was achieved with Dialogflow’s actions, a value configured for each intent that is returned as part of the matched result when we call Dialogflow with user input. This value is interpreted by our service, in this case DialogflowIntentService and its derivatives for each user type, and processed with a handler method.

Dialogflow — Action Section on an Intent page

I wrote quasi-idempotent because the same action value always leads to the same set of behaviors being performed by our service, and the response from the bot more or less follows the same structure for a given action.

More importantly, though, the burden of state management is placed primarily on Dialogflow, and SNS only needs to fulfill a preset series of expected behaviors whenever it encounters an intent match carrying action value x. A basic example for us would be the action we defined as CONFIRM_SELECTION, whose handler looks something like this:

private /* ... */ handleConfirmSelection(QueryResult queryResult) {
    // Localize the response text configured on the matched intent
    String responseText = fromDFKey(
        queryResult.getFulfillmentText());
    // Localize the 'Yes' / 'No' options
    DialogflowOptionContainer options = translate(CONFIRM_OPTIONS);

    return completedFuture(buildResponseBean(
        responseText, options));
}

The handler simply returns a response bean whose response text equals what was configured for the matched intent, and displays CONFIRM_OPTIONS, a constant holding the two options ‘Yes’ and ‘No’. (*Note: fromDFKey() and translate() are used to localize the actual text displayed to the user, based on the language set for the user’s country.) If you’re interested, the main service responsible for delegating to the right handler method looks like this:

public /* ... */ processMatchedIntent(R request, Q queryResult) {
    // Try the general handlers shared across user types first
    return handleGeneralIntents(request, queryResult)
        // Otherwise look up a customer-specific handler by action value
        .orElseGet(() -> identifyAction(queryResult.getAction())
            .filter(matchedIntentMap::containsKey)
            .map(action -> matchedIntentMap
                .get(action)
                .apply(request, queryResult))
            .orElseGet(() -> {
                //nvLogger.warn(...);
                //return default response bean
            }));
}

private /* ... */ initializeMatchedIntentMap() {
    matchedIntentMap.put(CUSTOMER_DISPLAY_MAIN_MENU,
        this::handleDisplayMainMenu);
    matchedIntentMap.put(CUSTOMER_DISPLAY_SUB_OTHERS_MENU,
        this::handleDisplaySubOthersMenu);
    //... more map.put(action, handler)
    return matchedIntentMap;
}

initializeMatchedIntentMap() is called to build a map from intent action values to their corresponding handler methods. For each request from a user of type CUSTOMER, we call processMatchedIntent(), which first executes handleGeneralIntents(), a superclass method that attempts to match general action values used by more than one intent service class. If none matches, it attempts to match a customer-specific action value and, if a match is found, calls the corresponding handler method from the customer intent function map.
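For readers who want to run the dispatch idea end to end, here is a self-contained sketch of the same shape. It is a simplification under stated assumptions: plain strings stand in for the request, the QueryResult and the response bean, and GENERAL_HELP and DEFAULT_RESPONSE are hypothetical names that do not come from our codebase.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;

class ActionDispatchSketch {
    // Action value -> handler(request, action) returning a response
    private final Map<String, BiFunction<String, String, String>> handlers = new HashMap<>();

    ActionDispatchSketch() {
        handlers.put("CUSTOMER_DISPLAY_MAIN_MENU", this::handleDisplayMainMenu);
        handlers.put("CONFIRM_SELECTION", this::handleConfirmSelection);
    }

    /** General handlers first, then user-type-specific ones, else a default response. */
    String process(String request, String action) {
        return handleGeneral(request, action)
            .orElseGet(() -> Optional.ofNullable(action)
                .filter(handlers::containsKey)
                .map(a -> handlers.get(a).apply(request, a))
                .orElse("DEFAULT_RESPONSE"));
    }

    /** Stand-in for the shared superclass handlers (hypothetical GENERAL_HELP action). */
    private Optional<String> handleGeneral(String request, String action) {
        return "GENERAL_HELP".equals(action)
            ? Optional.of("help for " + request)
            : Optional.empty();
    }

    private String handleDisplayMainMenu(String request, String action) {
        return "main menu for " + request;
    }

    private String handleConfirmSelection(String request, String action) {
        return "confirm (Yes/No) for " + request;
    }
}
```

Keeping the lookup in a map means adding a new action is one put() call plus one handler method, and unknown or missing action values fall through to the default response instead of throwing.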

This pattern is used for all intent service classes. As the number of user-specific intents grows, we also plan to look into moving these intents into separate classes based on the flows they’re involved in.

That’s pretty much the way we’ve set things up. In the next article, I will discuss the broader considerations that come with bringing our bot to an international audience: localization.

Written by Dexter Fong