How I’m using IBM Watson to turn my Humanoid Robot into a Cognitive and Empathic Assistant.
Tonight, inspired by a second viewing of Terminator 2 (I love that movie and its OST), I finally finished the second phase of my CognitiveBox project… Well, this article is going to be a little bit long…
I’ve been working on it every day after work until late (my girlfriend is so patient), putting all the pieces together: the hardware (the Raspberry Pi runs very smoothly), the software (mostly cloud software on the IBM Watson platform), and some scripts…
My initial idea was to use Artificial Intelligence or, as I prefer to say, Cognitive Computing to build a Virtual Assistant able to listen, comprehend, and answer a wide range of questions. But every day I worked on it my expectations grew, and I kept adding features and requirements to this virtual assistant until today, when I consider that I’ve reached the first milestone… but the road is long!!
In the name of Watson
First of all, I started thinking about a name for my creature. Since Jarvis was already taken, I decided on a cryptic, alphanumeric name that is full of significance for me: TJ-800. It is named after T.J. Watson from IBM and the T-800 (the good cyborg played by Arnold in Terminator 2: Judgment Day).
Building a personality for my robot
Second, I needed to define a personality for my Assistant. I’ve been developing chatbots with IBM Watson, and the heart and brain of my virtual assistant will basically be a multipurpose chatbot. But I wanted to define the personality of my robot clearly… I wanted it to answer questions in a certain way, using a certain tone and, as much as possible, with a great sense of humor and empathy.
All the chatbots I’ve been developing use Watson Tone Analyzer to understand the tone of the conversations they are engaged in, and TJ-800 will of course be able to analyze it in real time and react to anger, joy, or other sentiments.
Reacting based on sentiment means empathy, and empathy is the development goal of modern chatbots. I want to work on it with TJ-800 over the next months.
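As a rough illustration of what reacting to tone could look like, here is a minimal Python sketch. It assumes the tone analysis has already been reduced to a plain dictionary of scores between 0 and 1; the tone names, the 0.5 threshold, and the reply prefixes are all hypothetical, and this is neither the actual Watson Tone Analyzer response format nor TJ-800’s real dialog.

```python
# Hypothetical reply prefixes per dominant tone (illustrative only).
REACTIONS = {
    "anger": "I can hear you're upset. Let me try to help.",
    "joy": "Glad to hear that!",
    "sadness": "I'm sorry you're feeling down.",
}


def dominant_tone(scores, threshold=0.5):
    """Return the highest-scoring tone at or above threshold, else None."""
    if not scores:
        return None
    tone, score = max(scores.items(), key=lambda item: item[1])
    return tone if score >= threshold else None


def empathic_reply(answer, scores, threshold=0.5):
    """Prefix the chatbot answer with a reaction to the dominant tone."""
    tone = dominant_tone(scores, threshold)
    prefix = REACTIONS.get(tone)  # None when there is no known reaction
    return f"{prefix} {answer}" if prefix else answer
```

When no tone clears the threshold, or the dominant tone has no prepared reaction, the answer is returned unchanged, so the robot only comments on the mood when it is reasonably sure about it.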
Regarding the personality, I decided that TJ-800 will answer my questions much the way I would answer them myself. It will share my sense of humor and sarcasm as closely as possible… and to create a digital copy of my Social-Personality I decided to get the data for my Cognitive baseline by, of course, mining my Social Media profiles.
What better dataset could I use to represent a digital copy of my life than social media, which has been storing my memories and my thoughts over the last few years? So I started with Facebook and WhatsApp.
Gathering my Social Data
Facebook and WhatsApp both give me the option of downloading the whole set of my messages from the last few years. I only had to convert them into a format usable by Watson (Intents, Entities, and Dialogs). It’s a work in progress… and it will take a lot of time to complete this task.
Downloading the WhatsApp messages was a little more specific: I could select any single contact’s archive to be used as a source of conversations, and the decision was quite easy. I chose my girlfriend’s messages for two reasons: they are mostly in English (with some Polish, of course), and the dataset is the most relevant, with a whole range of expressions and situations. No personal data will be exposed, of course, in case one day I decide to open this chat to other people… I just hope my chatbot will not become too romantic :-)
Currently, I have about 7,536 question-answer pairs, out of roughly 13,000 Excel rows, to be used as baseline dialogs from my conversations with my girlfriend, and very soon I will start loading them into the cognitive engine of the chatbot. A dataset with 7,536 responses is not enough to provide a stimulating and dynamic copy of my Social-Personality, but it’s a good starting point.
The Facebook archive is huge… as you can imagine… I’m very active on Facebook, and I have more than 120,000 Excel rows to be parsed and cleaned before they can be used in my chatbot.
To manage my Social-Personality dataset from Facebook, I created an Excel file, and now I’m dividing it by language (in the last few years I’ve been chatting in Portuguese, English, Italian, and Polish), pairing all the messages in question-answer format, and removing the duplicates. It’s going to be a long job!!
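The pairing and de-duplication steps could be sketched in Python along these lines. This is only one possible approach, not the Excel workflow described above: it assumes the messages are already loaded as a chronological list of (sender, text) tuples, and the function names and the pairing rule (an incoming message followed by my next reply) are assumptions made for illustration.

```python
def pair_messages(messages, me):
    """Pair each incoming message with my next reply.

    `messages` is a chronological list of (sender, text) tuples.
    """
    pairs = []
    question = None
    for sender, text in messages:
        text = text.strip()
        if not text:
            continue                      # skip empty rows
        if sender != me:
            question = text               # remember the latest incoming message
        elif question is not None:
            pairs.append((question, text))
            question = None               # each question is used at most once
    return pairs


def dedupe(pairs):
    """Drop repeated pairs (case-insensitive), keeping the first occurrence."""
    seen = set()
    unique = []
    for question, answer in pairs:
        key = (question.casefold(), answer.casefold())
        if key not in seen:
            seen.add(key)
            unique.append((question, answer))
    return unique
```

With this rule, several incoming messages in a row collapse to the last one before my reply, and follow-up messages of mine with no new question in between are ignored. Whether that is the right trade-off depends on how the conversations actually flow.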
Giving a body to TJ-800
Now that the brain and the heart of TJ-800 were defined, it was time to think about how it should look. There were thousands of options: several types of robots, do-it-yourself (DIY) kits, and other stuff. After deep research, I decided on the Robosapien by WowWee.
This amazing robot has advanced biomorphic characteristics, very dynamic and fluid motions and gestures, and 67 pre-programmed functions; it is fully programmable and commanded via infrared. All I had to do was figure out how to use the included infrared dongle to send the right signals to the robot during the chatbot conversations.
Basically, the Robosapien is controlled through its IR remote, and it comes with a dongle that connects it to my smartphone through its app. I found out that there’s a way to control the robot by playing WAV files. I saw it on the WowWeeLabs GitHub page, so I downloaded the WAV files and started controlling the robot by connecting the IR dongle to my CognitiveBox audio jack! Eureka!!!!!
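To give an idea of how playing a WAV file can become a robot command, here is a minimal Python sketch. It assumes the Raspberry Pi has ALSA’s `aplay` command-line player available; the `wav` directory, the gesture names, and the file names are hypothetical placeholders, not the actual files from the WowWeeLabs repository.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from gesture names to WAV files; the real file
# names depend on the set of IR-command recordings you downloaded.
WAV_DIR = Path("wav")
GESTURES = {
    "wave": "wave.wav",
    "high_five": "high_five.wav",
    "dance": "dance.wav",
}


def play_command(gesture):
    """Build the player command for a gesture (assumes ALSA's `aplay`)."""
    try:
        filename = GESTURES[gesture]
    except KeyError:
        raise ValueError(f"unknown gesture: {gesture!r}") from None
    return ["aplay", "-q", str(WAV_DIR / filename)]


def send_gesture(gesture, run=subprocess.run):
    """Play the WAV through the audio jack so the IR dongle emits the signal."""
    return run(play_command(gesture), check=True)
```

Calling `send_gesture("wave")` would block until the file has finished playing. The `run` parameter exists only so the function can be tried without audio hardware by passing in a stand-in.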
Going through the technical side!!
Technically, my TJ-800 is a mix of Watson services (Conversation, Speech-to-Text, Tone Analyzer, Text-to-Speech, and Watson IoT) and some .js scripts, all wired together with Node-RED. To put everything together I studied a series of nice articles that I found on GitHub and Medium and adapted them to the project.
The most critical part, which I am still working to improve, is controlling the robot’s movements and synchronizing them with the dialogs. The easiest way I found to synchronize the TJ-800 body with its answers was to inject the right audio file to be played after each Watson text-to-speech answer. You can see the first results here in this video:
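The idea of queuing a gesture file right after the spoken answer can be sketched in a few lines of Python. This is not the Node-RED flow used in the project; the intent names, the file names, and the `play` callback are hypothetical, and the sketch assumes `play` blocks until playback has finished.

```python
# Hypothetical mapping from dialog intents to gesture WAV files.
INTENT_GESTURES = {
    "greeting": "wave.wav",
    "joke": "laugh.wav",
}


def build_playlist(speech_wav, intent):
    """Return the audio files to play, in order, for one answer."""
    playlist = [speech_wav]                 # the spoken answer always comes first
    gesture = INTENT_GESTURES.get(intent)
    if gesture is not None:
        playlist.append(gesture)            # then the matching gesture, if any
    return playlist


def perform(speech_wav, intent, play):
    """Play each file in turn; `play` must block until playback ends."""
    for wav in build_playlist(speech_wav, intent):
        play(wav)
```

Because the files are played strictly one after another, the gesture starts only when the speech has ended, which is a simple form of synchronization rather than a precise one.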
Lessons learned in this first phase of development
1. The IR sensor must be pointed at the Robosapien, which makes this solution a little less practical. For now there is no other way to control the robot (the Bluetooth version of this robot costs a lot of money and I did not want to spend so much on it).
2. Sometimes the Watson services take a long time to answer on the Raspberry (CognitiveBox) and the request can time out. Maybe it’s a latency problem, or maybe it depends on the Raspberry… I will find an answer to this.
As Phil Gilbert says: Everything is a prototype, and it is still a work in progress. A lot of improvements can be made, but I am really happy with the first results. It’s really promising.
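One common way to soften the timeout problem is to retry the call with a growing delay. The Python sketch below shows that general technique under stated assumptions: `call` stands for any function that contacts a remote service and raises `TimeoutError` when it takes too long, and the number of attempts and the delays are arbitrary example values. It is not something the project currently does.

```python
import time


def call_with_retry(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry on TimeoutError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise                          # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)   # wait 1s, 2s, 4s, ...
```

Retrying hides occasional slow responses but does not explain them, so it would still be worth measuring whether the delay comes from the network or from the Raspberry itself.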
I am having a lot of fun trying to challenge myself with Cognitive Computing. If you like it and want to check my next steps in this project… follow me and we will keep in touch.
Thanks
Jair Ribeiro
Wave Project Manager for Datacenter Migrations | Design Thinker | Cognitive Learning Leader at IBM
Originally published at https://www.linkedin.com on September 8, 2017.