How To Build A Stateful Bot

Alex Bunardzic
Published in Bots For Business
Jul 13, 2016

Photo by Milanka Bunard

I’ve been doing professional web development for almost 27 years, and I’ve learned a lot by building web apps. Since switching to bot development, I have realized that it is much harder than web development.

Why is bot development harder than web development? One reason: state! State is “all the stored information, at a given instant in time, to which the program has access”. Managing state on the web is child’s play compared to managing it in a bot’s memory. In this article we will delve a bit deeper into the challenges of state management.

Statelessness Is Desirable

One of the reasons the web is the most successful computing platform ever is that it is stateless. The web consists of a huge number of programs (web servers). The primary responsibility of those programs is to process incoming requests. The main advantage of the web is that these programs don’t have to maintain state. Once a program has processed a request, that request disappears from the server’s memory. The web server has no further access to that information, because it is not stored anywhere.

Why would that be desirable? The web architecture deals with an unpredictable number of requests. These requests may pile up, and many of them may arrive at approximately the same moment. The web server must be able to process all those requests. The web server must be able to scale. And the only way to do that is by minimizing the processing overhead.

If the web server were to keep track of the requests it had processed, it would soon collapse under its own weight. Even a moderately busy web app takes a sizeable number of requests in a short time frame. Retaining all those incoming requests would tax the server’s memory and slow it down. In more extreme cases, it would even crash the server.

Bots Are Stateless

Bots are stateless by default. Like a web app, a bot is a computer program that lives on the network (the web). And like web apps, bots are event-driven. While web apps receive requests (HTTP requests), bots receive messages. Once a message arrives, a bot processes it in the same way a web app processes a request.

Most bots available today discard the message once they process it. They don’t hang on to the message they’ve just processed. In that way, they are acting the same as web apps. Same as web apps, bots are stateless.

This baked-in statelessness makes bots even more scalable than web apps. The increased scalability comes from not having to juggle assets. Unlike web apps, bots don’t deal with markup tags, stylesheets, JavaScript, or binary content. Bots deal with text, which is tiny in size compared to web assets. Because of that, bots should be able to process more messages per second than web apps can process requests.

HTTP Requests And Bot Messages Are Not That Different

Both web apps and bots are event-driven, so the way they handle incoming calls should not differ. Let’s examine the process of handling incoming calls:

A call arrives on the network node/port. A server program is listening for events on that port. The arrival of the call is an event that triggers the server program into action. The implemented code accepts the content delivered over the network and parses it. The programming logic will decide what execution pathway is appropriate. That decision will most likely depend on the details embedded in the content.
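The steps above can be sketched in a few lines of Python. This is only an illustration, not how any particular bot platform works: the handler names, the command table, and the reply strings are all made up for the example, and the network listener itself is left out.

```python
from typing import Callable, Dict


# Hypothetical handlers; the names and reply texts are illustrative only.
def show_faq(args: str) -> str:
    return "Here is the FAQ."


def show_help(args: str) -> str:
    return "Try: faq"


# Maps the first word of the content to an execution pathway.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "faq": show_faq,
    "help": show_help,
}


def handle_call(content: str) -> str:
    """Parse the delivered content and pick an execution pathway."""
    parts = content.strip().split(maxsplit=1)
    if not parts:
        # Empty content: fall back to the help text.
        return show_help("")
    command = parts[0].lower()
    args = parts[1] if len(parts) > 1 else ""
    # Unknown commands also fall back to the help text.
    handler = HANDLERS.get(command, show_help)
    return handler(args)
```

Note that nothing is remembered between two calls to handle_call: the function sees only the content it was just given.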

The thing that will differ in the above scenario is the content itself. A bot message consists of some text and a bit of sparse meta-information. An HTTP request carries more meta-information. But if we peel back the meta-information from both calls, we may end up with a similar text message.

A simple example may help clarify. I’ll use CRAN (The Comprehensive R Archive Network) as an example.

Suppose a caller wishes to read the “Frequently Asked Questions” (FAQ) that the service publishes. Let’s also assume that CRAN has a bot capable of handling incoming calls. Let’s say the bot’s hypothetical handle is @cran.

When the bot @cran is listening on the network node, the caller can send the message “@cran faq”. The bot will parse the incoming message, decide to fetch the FAQ, and send it as a reply. Couldn’t be simpler.

What does the incoming call to the web site look like? The caller will have to send an HTTP GET request to this URI: https://cran.r-project.org/faqs.html. Upon receiving that request, the server will parse it. The server will see that it is an HTTP GET request. Furthermore, the server will see that the request mentions the faqs.html document. This knowledge will trigger the server to fetch that document. Once it fetches the document, the server will respond to the request by sending the doc to the caller.
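To make the "peeling back" concrete, here is a rough sketch that reduces both kinds of call to a bare keyword. The function names are invented for this example, the @cran handle is the hypothetical one used above, and the HTTP side only looks at the request line (a real request also carries headers).

```python
from urllib.parse import urlparse


def intent_from_bot_message(text: str, handle: str = "@cran") -> str:
    """Strip the bot handle from a message such as '@cran faq'."""
    words = text.strip().split()
    if words and words[0].lower() == handle:
        words = words[1:]
    return " ".join(words).lower()


def intent_from_http_request(request_line: str) -> str:
    """Reduce a request line such as 'GET /faqs.html HTTP/1.1' to a keyword."""
    method, target, _version = request_line.split()
    if method != "GET":
        raise ValueError("this sketch only handles GET")
    path = urlparse(target).path        # e.g. '/faqs.html'
    name = path.rsplit("/", 1)[-1]      # e.g. 'faqs.html'
    stem = name.split(".", 1)[0]        # e.g. 'faqs'
    return stem.lower()
```

With the examples from the text, the bot message comes out as "faq" and the HTTP request as "faqs": not identical, but close enough to show that the two calls ask for the same thing.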

Statelessness In Action

We’ve seen how both a web app and a bot handle incoming calls. Once they receive and process the incoming call, they respond/reply to the caller. Once the service responds, it moves on and forgets all about the event it has just handled.

If the caller then sends the same request (i.e. asks for FAQ again), the service obliges. If the caller keeps repeating the same request over and over, the service continues to oblige. The service never deviates from the knee-jerk processing. Let’s anthropomorphize that for a moment. The service goes “When I receive a request for FAQ, I respond with this content.” It never goes “wait a minute, you’ve already asked the exact same question!” The service has no opinion. The service does not judge. The service is stateless.

This is all cool and dandy for a web site or app. Users tend to perceive a web site/app as a mechanical contraption. Users don’t expect a web site to behave like an entity that possesses a certain degree of awareness. But the problem is that users do expect that quality from a bot.

If a bot keeps dishing out the exact same answer to identical questions in a row, it will appear dumb. People will not feel confident using such a dumb bot. So, unlike web apps, bots cannot afford to be stateless.

Introducing Session/Conversation

As a concept, a session is a crutch that helps stateless servers maintain state. In the previous example, the client requested a FAQ document. From the web app’s point of view, that’s pretty much a non-event. From the user’s point of view, it is a non-event as well.

But from a bot’s point of view, it is something the bot must keep track of. So bots are quite different from apps.

How is the bot going to keep track of the events that happen during the conversation? Bots have no other way but to hoist up a concept of session. In bot parlance, this session is better labeled as conversation.

Associate Conversation With A User

Each time a bot receives a message, it must first figure out who the sender is. Only two scenarios are possible:

  1. The sender (user) has never before talked to this bot
  2. The sender (user) has already talked to this bot

In the first case, it is the bot’s job to record the event by creating a new user profile. On the heels of that, the bot must create a conversation thread that belongs to the new user. The bot must then persist the new conversation thread. Finally, the bot must record the first message in the new conversation thread.

In the second case, it is the bot’s job to find the conversation thread belonging to the identified user. Then the bot must record the new message in that conversation thread.

It goes without saying that each record must be timestamped. That way, our bot can process messages in chronological (arrival) order.
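Here is one way the two scenarios could look in code. It is a minimal sketch, assuming SQLite as the store and one conversation thread per user; the table layout, column names, and function names are my own choices for illustration, not a prescribed schema.

```python
import sqlite3
import time
from typing import List, Optional, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) tables for users, threads and messages."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY);
        CREATE TABLE IF NOT EXISTS conversations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT NOT NULL UNIQUE REFERENCES users(id)
        );
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            conversation_id INTEGER NOT NULL REFERENCES conversations(id),
            direction TEXT NOT NULL,     -- 'in' from the user, 'out' from the bot
            text TEXT NOT NULL,
            recorded_at REAL NOT NULL    -- Unix timestamp
        );
    """)
    return db


def record_message(db: sqlite3.Connection, sender_id: str, text: str,
                   direction: str = "in", now: Optional[float] = None) -> int:
    """Find or create the sender's thread, append the message, return the thread id."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT id FROM conversations WHERE user_id = ?", (sender_id,)
    ).fetchone()
    if row is None:
        # Scenario 1: the sender has never talked to this bot before.
        db.execute("INSERT INTO users (id) VALUES (?)", (sender_id,))
        cur = db.execute(
            "INSERT INTO conversations (user_id) VALUES (?)", (sender_id,)
        )
        conversation_id = cur.lastrowid
    else:
        # Scenario 2: the sender already has a conversation thread.
        conversation_id = row[0]
    db.execute(
        "INSERT INTO messages (conversation_id, direction, text, recorded_at) "
        "VALUES (?, ?, ?, ?)",
        (conversation_id, direction, text, now),
    )
    db.commit()
    return conversation_id


def history(db: sqlite3.Connection, sender_id: str) -> List[Tuple[str, str]]:
    """Return (direction, text) pairs for one user in arrival order."""
    return db.execute(
        "SELECT m.direction, m.text FROM messages m "
        "JOIN conversations c ON c.id = m.conversation_id "
        "WHERE c.user_id = ? ORDER BY m.recorded_at, m.id",
        (sender_id,),
    ).fetchall()
```

Passing a file path instead of ":memory:" to open_store keeps the conversation threads across restarts, which is the point of persisting them.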

Prepare A Response To The User

Stateless bots are only concerned with processing the message that has just arrived. Once they prepare the response to the message they’ve processed, they send their response. After sending the response back, they discard both the original message and the response. Stateless bots have the attention span of a goldfish.

In contrast, stateful bots never discard anything they’ve processed. Their job is to keep both the received messages and the sent responses. That way, they build a history of the conversation with the user. Stateful bots have unlimited access to the entire history of all the conversations. That way, they are successful in maintaining the state of the conversation.

This is an important distinction. It matters because it influences how the bot is going to respond to the user. With a stateless bot, the only challenge is to grok the current message. A stateless bot can determine whether the message has positive or negative sentiment. It can also look for so-called keywords. For example, a keyword/keyphrase might be “What do you know?”, “Tell me about”, “Find about”, etc.

Once a stateless bot trips the wire by detecting a keyword, it just acts upon it. If the keyphrase is “Tell me about”, the bot will take the latter part of the message and do an internet search on it. Then the bot will send a response consisting of the content found on the net. The bot may massage the content found during the search, to pretty it up a bit for human consumption.
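A keyphrase-driven stateless bot can be sketched roughly like this. The keyphrases are the ones mentioned above; search_the_net is a hypothetical stand-in for a real search call, and the fallback reply is invented for the example.

```python
import re
from typing import Optional, Tuple

# Keyphrases taken from the examples above, lowercased for matching.
KEYPHRASES = ("what do you know", "tell me about", "find about")


def detect_keyphrase(message: str) -> Optional[Tuple[str, str]]:
    """Return (keyphrase, rest of the message) or None if nothing matches."""
    for phrase in KEYPHRASES:
        match = re.search(re.escape(phrase), message, re.IGNORECASE)
        if match:
            # The "latter part" of the message, trimmed of punctuation.
            rest = message[match.end():].strip(" ?.!")
            return phrase, rest
    return None


def search_the_net(query: str) -> str:
    """Hypothetical stand-in for an internet search."""
    return f"[search results for: {query}]"


def stateless_reply(message: str) -> str:
    """Act on the current message only; nothing is kept afterwards."""
    hit = detect_keyphrase(message)
    if hit is None:
        return "Sorry, I did not understand that."
    _phrase, topic = hit
    return search_the_net(topic)
```

Because stateless_reply looks at nothing except its argument, calling it twice with the same message always produces the same reply.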

A stateless bot is mindless. It will keep repeating the exact same response to the identical question. It has no way of knowing that the same user keeps repeating the question. And even if it had a way to know that, it wouldn’t care. A stateless bot has no opinions.

In contrast, a stateful bot has plenty of opinions. For example, a stateful bot would notice that the user keeps repeating the same question. It can do that thanks to its ability to keep track of the conversation. So how will the stateful bot do that?

Each time a stateful bot receives a message, it tries to place that message in a context. A context could be simple, like comparing the message with the previous one. If they’re identical, the bot responds by asking whether the user wants to get the same answer again.

Right there, by observing the context, the stateful bot projects the illusion of awareness.
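The simplest context check, comparing a message with the previous one from the same user, might look like the sketch below. The class name, the reply wording, and the answer method are assumptions made for the example, and the history lives in a plain dictionary here rather than in a persistent store.

```python
from collections import defaultdict
from typing import DefaultDict, List


class StatefulBot:
    def __init__(self) -> None:
        # Per-user message history, kept in memory for this sketch only.
        self.history: DefaultDict[str, List[str]] = defaultdict(list)

    def answer(self, text: str) -> str:
        """Hypothetical placeholder for whatever produces the real answer."""
        return f"Answer to: {text}"

    def reply(self, user_id: str, text: str) -> str:
        past = self.history[user_id]
        normalized = text.strip().lower()
        # Context check: is this the same as the user's previous message?
        repeated = bool(past) and past[-1] == normalized
        past.append(normalized)
        if repeated:
            return "You just asked me that. Would you like the same answer again?"
        return self.answer(text)
```

Each user has a separate history, so one user repeating a question does not affect the replies another user gets.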

Conclusion

We have seen how important it is for a bot to keep the state of the conversation. Bots that don’t do that appear dumb and irritating. For a bot to be able to establish an illusion of awareness, state is essential. And the best way to maintain the state of the conversation is to store each message. Then, once a new message arrives, compare it to previous messages to see if there is a pattern. If there is, your programming logic will decide how to prepare the response. Reacting to the user in that way will create a solid, reliable impression. People will then see your bot as a valuable servant.

Intrigued? Want to learn more about the bot revolution? Read more detailed explanations here:

The Age of Self-Serve is Coming to an End
Only No Ux Is Good UX
Stop Building Lame Bots!
Four Types Of Bots
Is There A Downside To Conversational Interfaces?
Are Bots just a Fad? Are GUIs really Superior?
How to Design a Bot Protocol
Breaking The Fourth Wall In Software
Bots Are The Anti-Apps
How Much NLP Do Bots Need?
Screens Are For Consumption, Not For Interaction
