A “Dialogue” on the recent advances in Conversational Artificial Intelligence (AI)

How important is it to interact, converse and emote in a world that is getting closed and parochial? Conversational Artificial Intelligence (AI) offers a leeway to build agents that have the capability to learn and respond like humans and thereby align in bringing the long term goal of General AI to fruition.

Conversation with artificial assistants, be it Microsoft’s Cortana, Apple’s Siri, Google Now or Amazon’s Alexa is gaining prominence in the last few years. So lay back, relax and enjoy the simple conversational interface at offer, as I take you through a short tour!

In this blog, I cover the latest developments in the field of Dialogue and conversational Artificial Intelligence (AI). I give a brief overview of the current developments from this field, the many Language Understanding tools in the market and in particular, review one of them — IBM Conversation.

It’s a rat race — So act and don’t over think!

After the horrors of Tay tweets -Microsoft’s conversational AI tweet bot that was eventually rolled back due to its racist and sexist tweets early this year, AI enthusiasts have had some good news over the last few months.

Microsoft hurried the launch of Tay tweets, its conversational AI bot only to shun it completely.

The Amazon Echo, Google’s Home and the smart home hub Apple has been preparing are good examples of how big companies are fighting tooth and nail to secure a place on your smart space. Here’s what Francis Chollet, researcher at Google and author of the popular framework — Keras has to say,

Whatever idea you started working on last week, a few other teams have probably been working on it for a month and are about to publish.
— François Chollet (@fchollet) October 5, 2016

Alexa Prize Competition

Just 2 weeks back, Amazon announced the Alexa Prize, an annual competition for university students dedicated to accelerating the field of conversational AI. This inaugural competition focuses on creating a social bot, using the Alexa Skills Kit (ASK) to converse coherently and engaging with humans on popular topics and news events. This gives student developer teams to explore a plethora of advanced topics in the realm of AI that include knowledge acquisition, natural language understanding, natural language generation, context modeling, commonsense reasoning and dialog planning. With a huge cash prize at stake, goodies at offer and support from the ASK team it would be worth an experience to build a socially coherent bot! The last date of team submissions is October 28, 2016 and more details about the application process can be found here.

Say Allo!

Google Allo, a smart messaging app that has personalized recommendations with the Google Assistant to express yourself better with stickers, doodles, and HUGE emojis & text. Allo also allows you to get help from your Google Assistant without leaving the conversation. A one to one conversation can be initiated with your Assistant which gets better as you use it more by addressing it with the @google tag. More functional details on the blog Say hello to Google Allo: a smarter messaging app

IBM Pepper developer Conference

The IBM BusinessConnect 2016 on 4th October 2016 in Stockholm, Sweden showcased some of IBM Watson powered tools, and applications in humanoid robot of Pepper.

Yesterdays #IBMBCSE at Stockholm Waterfront was fantastic thanks to all IBMers, partners and customers, and thanks to #Pepper of course! pic.twitter.com/quZuaptu8Z
— IBM ClientCtr Nordic (@IBMCCNordic) October 5, 2016

IBM’s Pepper is powered by SoftBank robot and uses IBM Watson technology at its core.

Banzai! (Live long) — Watch this first home robot commercial as the unforeseen future is coming!

The Watson Developer Conference is packed with technical talks, hands-on labs, and coding challenges to get you working with the tools that will make you a sought after developer and is going to be held in San Francisco from 9th to 10th November this year.

The IBM Global Industry Solution is located in Nice, France.

Joie de vivre — Samsung buys Viv

And after Google’s Allo and IBM’s pepper it was Samsung to jump into the Dialogue based conversational AI bandwagon as it acquired Viv, creators of Apple’s Siri. Viv is a more powerful version to Siri that brings in ubiquity. With its self-generating software that is capable of writing its own code to accomplish new tasks and by dynamic program generation, Viv handles new user tasks and build plans on the fly!

In its demo video on “Beyond Siri: The World Premiere of Viv with Dag Kittlaus” (as in the embedded link/video below) earlier this year, Viv was eventually be partnered or sold to a mobile device.

With everyone wanting to invest heavily, the question was who and when! Hence, this announcement from Samsung doesn’t come as a big surprise.

Viv will ultimately provide services to Samsung and its platforms but remain an independent entity. Samsung hopes to disrupt the mobile market share with this acquisition. It can extend it to other home devices, after all it had purchased SmartThings for around $200M back in 2014. More details on the acquisition here: Samsung acquires Viv, a next-gen AI assistant built by the creators of Apple’s Siri

Don’t take it slow because there is Ozlo !

Ozlo launched few days back on iOS and the web is another of the many sprouting AI assistants which uses good memory of one’s previous interactions. Ozlo, at least by its name attempts to be different than all assistants of its competitors in the market at present that use repetitive female names. The best thing is that it is integrated with a plethora of services like Yelp, TripAdvisor,IMDB, among many others and use Further Food, Authority Nutrition, Cookies, etc. to provide nutritional guidance. This is a huge boost than all of its rival companies which tend to prioritize their own services rather than integrating with existing services. An in-depth review can be found here: Ozlo AI assistant is the new underdog filling the void left by Viv

And there were rumors that Apple is going to buy McLaren, which set the eyeballs rolling as a big tech giant was entering a completely new domain of automobile industry and would lead others like Google, Microsoft and IBM to follow suit and invest heavily!

Conference workshops also wanting a dialogue!

There are in total 50 workshops at NIPS 2016 this year covering a range of different Machine Learning topics.

  1. The Dialog workshop, scheduled on the 10th of December focuses on building agents capable of mutually coordinating with humans via communication. And given the tremendous economic potential of the ability to converse intimately transcends to the overall goal of AI.
    For the call for papers, the deadline is extended to the midnight of October 23, 2016 and more details about the workshop schedule can be found at the chair website LET’S DISCUSS: LEARNING METHODS FOR DIALOGUE NIPS 2016 WORKSHOP The papers are on the below three high-level areas
  • Being data-driven especially the offline/online evaluation
  • Build complete applications or end-to-end systems
  • Model innovation to incorporate linguistic knowledge into the architecture

Another workshop on Interactive machine learning (IML) is to be held on the 9th of December. It focuses on the adaptable collaboration of how autonomous agents solve a task by making use of interactions with humans. Designing and engineering fully autonomous agents is a difficult and there is a compelling need for IML algorithms that enable artificial and human agents to collaborate and solve independent or shared goals.
The call for papers explores new ideas in interactive learning, reports on research in progress as well as discussions of open problems and challenges facing interactive machine learning with particular interest in the research on the practical application of interactive learning systems (for robotics, virtual agents, dialog systems, among others), and the ability of these systems to handle the complexity of real world problems. More details about the application process, requirements, application deadline, etc. is at the workshop portal Future of Interactive Learning Machines Workshop (FILM at NIPS 2016)

A review of Language Understanding tools — IBM Conversation

IBM Conversation within Bluemix

IBM Conversation was built on the lines of IBM Watson from the IBM Bluemix suite. It is now the for dialogue construction after IBM Dialog was deprecated.We start off by searching and then creating a dedicated environment in the console.

Setting up IBM Conversation from the Bluemix Catalog/Console

Basics

Conversation component in IBM Bluemix is based on the Intent, Entity and Dialogue architecture. And the same is the case with Microsoft LUIS (LUIS stands for Language Understanding Intelligent Service). One of the key components involves doing what is termed as Natural Language Understanding or NLU for short. It extracts words from a textual sentence to understand the grammar dependencies to construct high level semantic information that identifies the underlying intent and entity in the given utterance. It returns a confidence measure i.e. the top-most extracted intent out of the many pre-specified intents that gives us the most likely intent from the given utterance as per our trained model.

These are all statistically/machine learned based on the training data. Go over the demo, tutorial and documentation to get a more in-depth view of things at IBM Conversation.

The intent, entity and dialogue based architecture forms the crux of any SLU system to extract semantic information from speech and enables such a system to be generic across the various Language Understanding toolkits.

The Alexa Interaction model based on intent and slots in ASK

Another huge advantage that ASK provides for building such an architecture, is that it has multi-lingual support.

Conceptual Mapping

Intents can be thought of as classes where one classifies the input examples into one of them. For example,

Call Mark is mapped to the MOBILE class and Navigate to Munich is mapped to the ROUTE class

The entities are labels, so e.g. from above, you can have
Mark as a PERSON and Munich as a CITY.

Major advantage and drawback

Both Conversation and LUIS use a non-Machine Learning based approach for software developers or business users to create a fast prototype. It is definitely easy to begin with and gives a lot of options to create drag and drop based dialogue system. However, it can’t scale up to large data. A hybrid approach that can combine or build a dynamic system on top of this static approach is needed for scalable industry solutions.

Extensions

Moreover, an end to end workflow can be built by plugging in components from Node-RED and introduction to the same can be viewed in the below video.

What’s good is that they have a component for Conversation as well. So, we can build a complete chatbot starting from a speech to text component to get the human commands translated to text, followed by a conversation component to build up the dialog and lastly by a text to speech component to translate this textual dialogue back to speech to be spoken by a humanoid or a mobile device!

Missing components and possible future work

It is not possible to add entities/intent dynamically through the UI after the initial workspace is constructed. The advanced response tab doesn’t allow to edit (add) the entities in the response field, like for example adding variables to the context. We can edit it (highlighted in orange) but it doesn’t save or get reflected.

{

“output”: {

“text”: “I understand you want me to turn on something. You can say turn on the wipers or switch on the lights.”

},

“context”: {

“toppings”: “<? context.toppings.append( ‘onions’ ) ?>”

},

“entities”: {

“appliance”: “<? entities.appliance.append( ‘mobile’ ) ?>”

}

}

Moreover, the link which only mentions accessing intents and entities but not modifying them.

The only place to add the intent, entities is back in the work space and not programmaticaly at run time. Perhaps, a possible solution can be to use UI with DB data to save the intermediate and newly discovered intent/entity values and then update the workspace later.

As I end this blog, perhaps there would be another AI assistant released that has moved beyond its embryonic stage to conquer real life application scenarios. Conversational AI is hot property, so dive in to reap its benefits, both from an end user and developer’s perspective!

Note: Hope you enjoyed the read. I have deliberately kept the content a mix of non technical and technical to build the excitement and buzz going around this exciting field of conversational AI! Publishing this blog was on my list as I was compiling lot of facts since last few weeks but I had to hurry even more, given the recent news surrounding this upsurge. As always, any feedback as a comment below or through a message are more than welcome!