The “chat” in “chat bot” is a distraction

Chat bots are all the rage in our little technology industry echo-chamber. It’s easy to get distracted by messaging applications and services right now. WeChat, WhatsApp, Facebook Messenger, Slack, etc. are starting to dominate our online lives both at home and at work. WeChat has turned itself into a little AOL-like micro-internet, with conversation as the central metaphor. It’s the end of the App era, so they say.

It’s only natural, therefore, that we start the usual platform gold rush. Companies are clamoring to become the platform providers, infrastructural glue, and app providers for our conversations. Judging by in-person conversations, a quick trip to Silicon Valley, and the contents of my email inbox, I’d guess that multiple new startups are forming every week in this space.

Everyone’s excited about these bots, but if you look at what they actually do, you’ll usually find two things: an adapter allowing developers to talk to multiple proprietary chat platforms, and a natural language interface. That’s right, they focus on the “chat” in “chat bot”.

But “chat” is only the superficial part of what’s changing. It’s a lighter, potentially more ubiquitous user interface. This is a big deal. But it’s not the biggest part of the big deal.

The more impactful part of “chat bot” is “bot”. In this context, “bot” (short for robot) implies both automation and intelligence. Neither of these is specific to a conversational interface. In fact, the automation part is best done with no user interface most of the time.

Conversation provides bots with context. Sometimes the context is direct and explicit: you ask the bot a question and it answers. Other times (ideally most?), the context is implicit. A bot is added to a conversation and learns as it listens. If it discovers something it thinks it can help with, it interjects. “It sounds like you’re trying to plan a time to meet. I can access your calendars. Should I set something up for you?”

That’s an example of both intelligence and automation that happens to use chat as both the context and the user interface. The most interesting part of that example interaction, though, is that the bot was providing ambient computation. Nobody clicked a button or said “hey bot, set up a meeting”. It just listened, learned from context, and offered to help.

Think of all the context available to an authorized computing system these days. This is what’s new. For the first time in history, a computer system could theoretically have access to almost any relevant piece of information about you, your context, and the context of those around you in real time, and possess cheap enough processing power to reasonably do something helpful with that information on a mass-market level. Additionally, we have increasingly widespread expertise in how to apply mathematics to an ever-increasing set of problems. Machine learning is on the verge of being recipe-driven for a subset of problems, allowing hackers (non-scientists/mathematicians) to hack together seemingly magical systems.

From entertainment preferences and habits to always-on geo-location data, systems in 2016 can trivially access a wealth of both real-time and pre-processed data about us and our surroundings. Your conversations, photographs, email, chats, and calendars are all accessible. You wear Apple Watches, Fitbits, and Microsoft Bands which track your heart rate, sleep, and physical activity. You play games online, and computer systems track your history and performance. Systems know increasingly more about your real-world purchases, from groceries to music equipment. Systems increasingly know about your travel habits; we’ve gone from using the internet to book flights to using the internet to order cars on demand, and soon we’ll have those cars drive us autonomously. Systems manage and track your thermostats, home lighting, and music.

What’s more, the connections between all of these real-time inputs form even more powerful sources of data. For example, you watch a certain type of TV show, order a certain type of food, and you don’t sleep well. You perform badly at games when you don’t sleep well, implying you perform badly at work as well. A system with enough ongoing data about you can predict chains of effects you might not be able to anticipate yourself. The correlations between disparate data sources, combined with the ability to aggregate across many people, give us data points that would have been impossible previously.

Even if you don’t post and track stuff, your friends do. Systems know about you. And they’re only going to learn more and more over time.


An ambient agent can reach you through whatever channel fits the moment:

  • Conversation
  • Application pop-ups
  • Push notifications
  • Email
  • Direct data manipulation (taking action for the user, such as scheduling a meeting)
  • Interaction with household and wearable devices

The key to interacting with an ambient agent is convenience. If you’re driving when the agent has something to say, voice communication is the safest and most convenient interface. If you’re at a desktop computer working, a graphical popup with traditional controls might be more convenient. It all depends on context.
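That context-dependent choice amounts to a small rule table. The context fields and channel names in this sketch are illustrative assumptions, not a real device API:

```python
# A sketch of convenience-driven channel selection for an ambient agent.
# Context keys ("driving", "at_desktop", ...) are hypothetical signals the
# agent might derive from location, calendar, and device data.

def pick_channel(context: dict) -> str:
    """Choose the safest, least intrusive interface for the moment."""
    if context.get("driving"):
        return "voice"             # hands and eyes are busy: speak instead
    if context.get("at_desktop"):
        return "popup"             # traditional graphical controls work best
    if context.get("wearing_watch"):
        return "wrist_tap"         # a subtle nudge on a wearable
    return "push_notification"     # fall back to the phone

print(pick_channel({"driving": True}))     # voice, never a screen
print(pick_channel({"at_desktop": True}))  # a graphical popup
```

In a real agent the rules would themselves be learned from which interruptions you accept or dismiss, but the shape of the decision is the same.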


Fortunately, researchers are working on the difficult privacy problems here from various angles. And the problems really are difficult. As Microsoft Distinguished Scientist Cynthia Dwork says, “De-identified data is neither de-identified nor data”, which I extrapolate to mean “the common techniques for attempting to ensure privacy in data mining both make the data worthless and fail to ensure privacy”.
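Her point is easy to demonstrate. The classic failure mode, as in the well-known linkage of “anonymized” hospital records to public voter rolls, is that stripping names still leaves quasi-identifiers that join cleanly against other datasets. A toy sketch with invented records:

```python
# Why "de-identified" data often is not: records stripped of names can
# still be linked to a public dataset through quasi-identifiers (here
# ZIP code, birth year, and gender). All records below are invented.

medical = [  # "anonymized" release: names removed
    {"zip": "02138", "birth_year": 1954, "gender": "F", "diagnosis": "flu"},
    {"zip": "90210", "birth_year": 1988, "gender": "M", "diagnosis": "asthma"},
]
voter_roll = [  # public dataset that still carries names
    {"name": "Alice", "zip": "02138", "birth_year": 1954, "gender": "F"},
    {"name": "Bob",   "zip": "60601", "birth_year": 1971, "gender": "M"},
]

KEYS = ("zip", "birth_year", "gender")

def reidentify(anon_rows, public_rows):
    """Join the two datasets on their shared quasi-identifiers."""
    index = {tuple(r[k] for k in KEYS): r["name"] for r in public_rows}
    return [(index[key], row["diagnosis"])
            for row in anon_rows
            if (key := tuple(row[k] for k in KEYS)) in index]

print(reidentify(medical, voter_roll))  # Alice's diagnosis leaks
```

Removing the obvious identifiers accomplished nothing here, which is exactly the gap techniques like differential privacy are designed to close.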

Thanks to smart people like Cynthia, I think we’ll solve these problems eventually. I also believe the industry will go ahead with invasive applications of machine learning and ambient computing in the meantime, and that consumers will accept the trade-off given the magical capabilities these systems can create.

I help startups succeed. CTO, speaker, author, investor — currently @microsoft @blueyard
