Birth of Ultron sequence from “Avengers: Age of Ultron” (source)

A Perspective on Conversational UIs and Chatbots

There is a lot of talk these days about conversational UIs, chatbots, conversational commerce, and the like. It has been “in the air” for a while, and many smart people I know have been drawn to this space and have been thinking about it a lot. But when did it all start and what sparked it off?

To start, SMS is a piece of crap. It was built on top of the cell phone voice communications infrastructure to carry tiny bits of text. It’s incredibly slow, unreliable, and does not handle photos/videos very well. Plus the carriers charge people an arm and a leg, especially internationally.

So when Apple launched APNs (the Apple Push Notification service) in June of 2009, many developers jumped on it and started creating their own messaging apps, since it was a prime opportunity to make something significantly better than SMS.

The most notable in hindsight was WhatsApp, which was incorporated in early 2009 and became a messaging app pretty much right after APNs was released. Kik Messenger was also founded that year. In the summer of 2010, a couple of group messaging companies, Beluga and GroupMe, were started, and others followed. Then there was a cluster of new messaging apps that became popular in Asia and launched at the end of 2010 and in early 2011: Line, Viber, and most notably WeChat. That cluster may have been related to the release of the push notification API in Android with Froyo (Android 2.2, API level 8) in May 2010, but KakaoTalk in Korea launched almost a year before those apps, in March 2010, which may have led the trend in Asia.

Many of the companies that were created in that year and a half from the summer of 2009 to early 2011 have had a long-lasting impact. Beluga was bought by Facebook and became Facebook Messenger. WeChat is now a major force in China and everybody in the West wants to emulate it. But probably the most significant of all was WhatsApp. When Facebook bought it for $19B (or $22B depending on how you count it) in February 2014, it sent shock waves through the tech world. Not only was it an insane price, but it came totally out of left field. Nobody saw it coming. It was also the only time in recent memory that I’ve seen somebody offered a board seat as part of an acquisition, and a seat on Facebook’s board of directors at that.

Everybody naturally had to figure out what the impact of this would be. I imagine it focused a lot of smart people’s attention specifically on messaging. My guess is that many of them didn’t just go off and think, “hey, I should build a messaging app”, though we did get some apps like Yo and Ethan. But the concept of messaging definitely stuck in their heads, and so it became part of the ether and buzz going around Silicon Valley. I’m sure everybody wanted to be the next WhatsApp.

The next big moment came when Magic launched in February 2015. It was a side project that Mike Chen launched while he was part of the 2015 winter class of Y Combinator. The idea he got into YC for was a blood-pressure monitor called Bettir, but as the article in Wired mentioned:

Building their back-end system led Chen and his four co-founders — Ben Godlove, Nic Novak, Michael Rubin, and David Merriman — to wonder: how much can you really do over text? What if you could just say “I want a pepperoni pizza” or “I need the soonest possible reservation at Nobu,” and it would just happen? You wouldn’t have to worry about the how, the where, or the by whom — that would be somebody else’s problem. All you’d have to do is send a text.

It was a pondering, a question, that they acted upon by running a small, simple test to see if anything was there. It took off like wildfire, got featured on Product Hunt, and eventually led to a $12M Series A from Sequoia at a pre-money valuation of $40M. This was another shock to many people, who could not believe how much Magic had raised or the valuation. But it was Sequoia, who also backed WhatsApp, Google, and many other legendary companies, so what were they missing?

A series of blog posts came out soon after, one from Jonathan Libov of Union Square Ventures, then others from Benedict Evans and Connie Chan at a16z. A couple of people had written pieces about this general trend right before, most notably Ted Livingston of Kik, who wrote about his experience with WeChat, and Chris Messina with his “Conversational Commerce” piece. But again, it was in the air, brewing for a while in the minds of all the Silicon Valley elite, since the WhatsApp acquisition had happened the year before.

Soon after, in April 2015, Operator launched, having been in stealth for about a year. Then Facebook M launched in August 2015, and it was off to the races. At the end of 2015, Facebook Messenger integrated Uber so you could get a car just by sending a message, and at the beginning of 2016 WhatsApp removed its $1 subscription fee in favor of monetizing from businesses instead.

But the most interesting recent piece is from Sam Lessin. He wrote a great piece for “The Information” and then reposted it on the blog for his company, Fin. He wrote it from the perspective of a developer tired of the world of apps, where it takes so much effort to build and launch something, where it is so hard to get noticed among the thousands of other apps that launch every day, where you have to put up with the annoyances of the App Store and its closed system, and where in the end people still just use the same 5 apps from Facebook or Google. A world where the interface is just chat was refreshing. It became more like the open world of the web, where you could just make changes on the server and didn’t need to push out a new client. You also didn’t need to worry about the super slick, fancy physics-based user interfaces, which take an incredible amount of engineering time to build and polish. It’s just text.

I do think that frustration in the developer community with the current state of the app world is key to what will push conversational interfaces and chatbots into the future. If you look at the bang for the buck on the effort you put into building an app, the returns are getting smaller and smaller. It’s too hard to get noticed, too slow to get things done, and there is not enough green field for founders and entrepreneurs to find it enticing. Chat interfaces, chatbots, conversational commerce, or whatever you want to call it, are creating a new green field, and it seems to be resonating with developers.

But to achieve that vision, you need an open messaging platform, just like HTTP was for the open web. The only one that really exists today is SMS. It’s on everybody’s phone, and there are companies like Twilio that allow anybody to send messages over it. And therein lies the rub: SMS was a piece of crap and still is a piece of crap today. It is hard to imagine WhatsApp, WeChat, Facebook Messenger, or any other major messaging platform opening up as freely as SMS. There is so much at stake that they will have to be very careful about how they open up to developers. WhatsApp has been especially cautious and has come down hard on developers who have hacked their way into its platform. Apple actually has a potential play here with iMessage, which is significantly better than SMS. If companies could use iMessage’s communications stream for fast delivery, read receipts, and support for photos and videos, I know many developers would jump on it in a heartbeat. Several have tried to hack their way into it, using an exploit with the accessibility mode on iPhones, but Apple shut that down promptly. Google just recently announced support for a new messaging protocol, RCS (Rich Communication Services), at Mobile World Congress that will hopefully improve things, at least on the Android platform.

So for now, it may be that SMS will have to be the answer, at least in the US. Many companies are already going in that direction, like Assist and Digit, and it might be just enough to get the plane off the ground. Internationally, I think it’s a different story, and it may end up being a closed system that wins out. But maybe Telegram will be a good contender, now that it has over 100M monthly users.

The other major issue is the business model. It may seem obvious given the “conversational commerce” moniker, but the problem is the mismatch between how much the merchant/vendor currently makes per transaction and how much a customer is willing to pay via a chat interface. The gap is wider than you might think. And if you are doing retail, you are competing against low-margin behemoths like Amazon. How much of a markup will your customers accept for the added shopping concierge service? This is what many companies are trying to figure out right now, and my guess is there is much to be learned from looking at how the on-demand businesses are faring today. Basically, it’s not easy to turn this into a profitable business.

And this is where artificial intelligence comes in. If you replace humans with bots, then you have a chance of creating better margins. There are some companies that are going totally bot powered, others that are going totally human powered, and then there are those that are doing a hybrid of bots and humans. The theory behind the hybrid is that initially it may be 90% human and 10% bot, but as the bot scripts get better and deep learning systems become more capable, it will become 10% human and 90% bot. This is another aspect of conversational UIs that is attractive to developers. All the smart people I know are incredibly interested in artificial intelligence, and here is a practical problem that it could be used to solve.
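To make the hybrid idea a bit more concrete, here is a minimal Python sketch of how such a handoff might look. Everything in it is an assumption for illustration: the `SCRIPTS` table, the `HybridDesk` class, and the keyword matching are hypothetical stand-ins, not how any of the companies mentioned above actually work. The bot answers when one of its scripts matches, everything else goes to a human, and the bot's share of the traffic is just the fraction of messages it handled.

```python
from dataclasses import dataclass, field

# Hypothetical canned scripts (keyword -> reply). A real service would more
# likely use a trained intent classifier than simple keyword matching.
SCRIPTS = {
    "pizza": "Sure, what size and toppings would you like?",
    "reservation": "For how many people, and at what time?",
}


@dataclass
class HybridDesk:
    """Routes each incoming message to a bot script or to a human queue."""

    human_queue: list = field(default_factory=list)
    bot_handled: int = 0

    def handle(self, message: str) -> str:
        text = message.lower()
        for keyword, reply in SCRIPTS.items():
            if keyword in text:
                # A script matched, so the bot answers on its own.
                self.bot_handled += 1
                return reply
        # No script matched: hand the conversation to a human operator.
        self.human_queue.append(message)
        return "One moment, a person will pick this up."

    def bot_share(self) -> float:
        """Fraction of all messages so far that the bot handled itself."""
        total = self.bot_handled + len(self.human_queue)
        return self.bot_handled / total if total else 0.0
```

In this toy version, moving from "10% bot" toward "90% bot" just means adding scripts so that fewer messages fall through to the human queue. A nice side effect of the design is that the human queue doubles as a to-do list of scripts worth writing next.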

All of these things have come together to create this fervor about chatbots and conversational UIs. It has gone a bit overboard lately and people are applying it to any problem they think it has even a remote possibility of solving, but that being said, I do think there is something interesting here and worth pursuing.

Only time will tell whether this latest trend is a fad or something more permanent, but either way I’ll be watching it very closely and trying out my own experiments to see where it all ends up.