The Rise Of The Social Robot

Mark Stephen Meadows
Published in Social Robots · Jun 11, 2016

“How Do You Do?”

I was deeply alone. My family was in the next room, laughing and screwing around with Legos or puzzles or something. I was standing in the middle of a rocky amphitheater that looked like Monument Valley, Arizona. I really don’t know where I was, but I was standing at the edge of a sandstone dais, and something like Stonehenge was orbiting overhead.

The GearVR strapped to my noggin didn’t feel heavy at all. I was in Land’s End, having a blast, solving my own puzzles in virtual reality.

A small black figure rose up from one of the stones around me — a virtual ghost-baby — and then three more, and then a black hole started growing, and I wondered, “What the hell’s going on?” I wanted to talk with these ghost-babies. I wanted to get their help sorting out events, to have them share the backstory with me. Who were they? What’s up with this black hole thing? Why are there boulders flying around overhead?

At that moment I felt something soft touch my knee. I shoved the headset back onto my forehead, looked down, and saw my kid. He’s a real, physical four-year-old, and by the look on his face he was thinking the same thing as me: “What the hell’s going on?” After all, I was standing alone in a room with a box on my face waving my head around.

Now, I know I’m not supposed to put a VR headset on a toddler, but he couldn’t figure out what was going on in there, either.

Land’s End, by UsTwo Games (photo by author)

I’ve got this emotional instinct to share things. It divides the pain and doubles the fun. I like to sort out simple questions, with my kids in particular. We investigate things like the mysteries of a bug’s job, the taste of tree bark, the language of birds. This stuff is fun partly because we’re social. Because we humans are social, we make social things. Things like language. Or the phone. Or Facebook.

Or social robots.

Shake Your Chassis

Once upon a time I met the guys who built the first androids (at AIST, in Japan) and asked them why the function was following the form (after all, a humanoid shape is a horrid functional design for any robot). They said it was because it was a social system: it needed to look like a person so that people could socially engage. Sure, people have a hard time relating to a blob or a box, but I’m not sure a robot has to have a person-shaped face, never mind a person-shaped body. People socialize with cats and dogs all the time, so building an entire human-shaped body’s sorta gratuitous. It seems to me that most social robots are lugs first and socialites second. Famous robots like NAO, Asimo, or Pepper are marketed as social, but if we were to measure the hours that went into building the social parts of the robot and weigh them against the hours spent on the mechanical componentry, I’d bet the scales would tell the truth. Physical robots have all this stuff like sprockets, stepper motors, mounts, pans, circuit boards, bearings, and casings … and none of that physical stuff is social.

Androids are supposed to be social robots, but I’m not so sure sculpting a computer case to look like a head is the right approach. When it comes to social interaction, physical bodies just aren’t that important. There are advanced forms of social interaction (like sex) where a body really comes in handy. And for people-to-people interaction we definitely do better in person.

However, there are some really smart, inventive people building physical social robots. People like Cory Kidd, or David Hanson, or Trevor Blackwell, or Alexander Reben, or Andra Keay, people making well-designed and very social robots built to help or whatever. These folks are convinced that the physical element has to be there. The body isn’t obsolete for them. But the priorities of human-to-human interaction are different from those of human-to-robot interaction. In the end, a smile isn’t important because it’s a mouth. It’s important because it transmits social data. Sure, “meetings are golden,” sure it’s nice to “look someone in the eye” and great to “press the flesh.” Yes, it’s hard to maintain a long-distance relationship, but my relationship with my wife is different from my relationship with my robot. They’re different things.

The debate will burn for decades, with some of us arguing that “we should build humanlike robots” and others replying that “there is no point making robots look and act like humans.” And then there’s the uncanny valley.

Just as the soul lives in our wetware, when it comes to robots the social lives in the software. The physical form is just there to convey more information. The chassis, or mouth, or butt, or whatever is just a support for the semiotic cue.

It may as well be virtual.

Shake Your Bot

Bots are social. We’re talking about ChatBots (not things like BotNets — and if you need to know what a ChatBot is, go read another article). Bots are the most social software ever engineered. They’re social by design, since most of them are simply text interfaces. We can therefore say the form is following the function. This is a better way to design something: it’s user centered design (instead of interaction centered design). And because dialogue is one of the key modes of interaction that bots have, and because dialogue is a pretty complicated thing, the design of bots has ramped up comparatively gradually compared to something like webpages. Webpages and bots both use language, but dialogue is more social because it’s interactive.

So, bots are social, their form is following their function, and they’re a start of a significant technological trend (and probably more significant than androids for these reasons).

Bots are going to become increasingly social as API aggregation and the micro-services that bind to these interfaces allow them to recognize our words, our faces, where we are, what we want, what we said, what we meant, and why what we said and what we meant were different. The multiplexing of services like Amazon’s Alexa Skills Kit (ASK), Nuance Mix, Microsoft’s Cortana Suite, Wit.ai, or Viv will be just the start. Multiplexing multiple NLP services like ChatScript, AIML, Facebook’s platform, and others is already happening. We’ll soon multiplex multiple voice recognition services to increase confidence rankings, rank multiple affect services for redundant data, use Bayesian models to self-edit and even author dynamic knowledge bases, self-learn from experience, adapt to the particular topic of a conversation, and generally get wicked smart. Soon the social intelligence of bots and their ability to read people will exceed what most of us can do today. And this will happen in under a decade.
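To make that concrete, here is a rough sketch of what multiplexing can look like in practice: send the same utterance to several services, let each one vote, and keep the most confident interpretation. The service wrappers and scores below are hypothetical stand-ins for illustration, not real SDK calls.

```python
# A rough sketch of multiplexing, assuming hypothetical service wrappers:
# send one utterance to several NLU backends and keep the most confident reading.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Interpretation:
    service: str       # which backend produced this reading
    intent: str        # what it thinks the user wants
    confidence: float  # the backend's own confidence, normalized to 0..1

def multiplex(utterance: str,
              services: List[Callable[[str], Interpretation]]) -> Interpretation:
    """Fan the utterance out to every service and return the highest-confidence result."""
    readings = [call(utterance) for call in services]
    return max(readings, key=lambda r: r.confidence)

# Two fake backends standing in for real providers (the scores are invented).
def wit_like(text: str) -> Interpretation:
    return Interpretation("wit-like", "set_reminder", 0.72)

def mix_like(text: str) -> Interpretation:
    return Interpretation("mix-like", "set_alarm", 0.81)

best = multiplex("wake me at seven", [wit_like, mix_like])
print(best.service, best.intent, best.confidence)  # mix-like set_alarm 0.81
```

The same pattern extends beyond intent recognition: run several voice recognition or affect services in parallel, compare their confidence scores, and only act when enough of them agree.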

But techno-hooray aside, we have to place people over technology, because ultimately bots are psychological tools.

I built my first bot for Oracle in 1999. It was supposed to be a customer relationship management system: a user could type questions to it via a webpage and it would type back answers, so that the poor Bangladeshis who were constantly replying to the same questions for OracleSalesOnline, 24 hours a day, wouldn’t have to go to work. The irony of the situation aside, the bot was just a FAQ. But it was still social, as it engaged in interactive dialogue, and it managed to both please and piss people off. Those of us working on it (it was built in AIML, I think) were proud of it, but Oracle shit-canned the project because users were accustomed to interacting with another person and felt cheated when they found out they were texting with software.

There were two lessons my first bot taught me about how to build a good bot. 1) The user needs to know who they’re dealing with, and 2) emotions are the interaction.

These lessons still guide me and still seem valid because if I look at the most intense social interactions (lovers, politicians, friends, family) these two factors are proportionally key to the intensity of their relationship. In other words, the more someone knows who they’re dealing with, and the more emotions are a part of the interaction, the more intense the relationship.

We can stroll along this path and get to a vantage point. I’m willing to predict that people will fall in love with bots when people know the bot, when the bot reflects knowledge of them, and when the bot is able to accommodate emotional interaction.

But why are we programming social systems at all? Why make robots or avatars social in the first place?

The answer is, simply, Power.

The Effect of Affect

The field of Affective Computing is slowly rebooting human-computer interface design, largely because it is so effective.

The way affective interaction generally works is by capturing a person’s emotional cues as input and generating output that appears appropriately emotional.

The input from the human can be tricky to capture, at least today. The way it works is that the appearance, sounds, and words a person makes are mapped to databases that rank emotion variables. So, if someone is saying the word “happy” and looking happy, and making happy sounds, then you can pretty confidently measure them as being happy. Same with grumpy, or stressed, or calm, or whatever. That’s the input. It requires processing large libraries (hence cloud-based SaaS). Microsoft Cognitive Services offers a host of these things. So do other providers like EmoVu, iMotions, Receptiviti, IBM Watson, Kairos, Emotient (recently acquired by Apple), Affectiva, nViso, or scads of others. It’s a dark art of human interaction despite the bright math of the various APIs out there.
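To give a toy picture of that input side: imagine each channel (face, voice, words) scoring the same set of emotions, and agreement across channels raising our confidence. The scores below are made up for illustration; they don’t come from any real API.

```python
# A toy fusion of emotional cues: average per-emotion scores across channels
# (optionally weighted). The numbers here are invented for illustration.

def fuse_emotions(channel_scores, weights=None):
    channels = list(channel_scores)
    weights = weights or {c: 1.0 for c in channels}
    total = sum(weights[c] for c in channels)
    emotions = channel_scores[channels[0]].keys()
    return {
        e: sum(channel_scores[c][e] * weights[c] for c in channels) / total
        for e in emotions
    }

scores = {
    "face":  {"happy": 0.80, "angry": 0.05, "neutral": 0.15},
    "voice": {"happy": 0.65, "angry": 0.10, "neutral": 0.25},
    "words": {"happy": 0.90, "angry": 0.02, "neutral": 0.08},
}
fused = fuse_emotions(scores)
print(max(fused, key=fused.get), fused)  # "happy" wins because all three channels agree
```

When the channels disagree (a smiling face with an angry voice, say), the averages flatten out, which is exactly the kind of ambiguity that makes this a dark art.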

The output from the system is a little easier to manage, at least today. Similar to the input, the words, appearances, and sounds of the system are related to emotional semiotics. So an avatar or robot might say “happy” in a high-pitched tone and smile. We drive our avatar animations with ACTR, which takes the affect and duration of the words in the AI’s output and moves the character’s face or hands or whatever accordingly. We can use SSML to change how the character’s words sound: warmer or colder, happier or sadder. We work with poets and writers to craft those words according to personality archetypes, and drive the stories just as a screenplay would.
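Here is a minimal sketch of that output side: wrap the bot’s reply in SSML so a speech synthesizer renders it warmer or flatter. The prosody element is standard SSML; the mapping from an emotion label to a particular pitch and rate is something I’ve invented for illustration, not a production-tuned model.

```python
# Wrap a reply in SSML prosody tags. <prosody> is standard SSML; the
# emotion-to-prosody mapping below is an invented example.

def to_ssml(text: str, emotion: str = "neutral") -> str:
    prosody = {
        "happy":   {"pitch": "+15%", "rate": "fast"},
        "sad":     {"pitch": "-10%", "rate": "slow"},
        "neutral": {"pitch": "medium", "rate": "medium"},
    }[emotion]
    return (
        f'<speak><prosody pitch="{prosody["pitch"]}" rate="{prosody["rate"]}">'
        f'{text}</prosody></speak>'
    )

print(to_ssml("I found three flights for you.", emotion="happy"))
```

The same emotion label that selects the prosody can also select the facial animation, so the voice and the face stay in sync.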

The difficulty with affective computing models is that different people have different emotions (and different numbers of emotions). Emotions seem to be influenced by culture, and emotions are also these ambiguous, gooey, dynamic things that are housed in moods and personality, which means they’re pretty damn hard to understand, let alone concatenate, process, and contextually reflect back. At least today.

There’s this great line in the science fiction movie A.I. (that Spielberg film from 2001), when Gigolo Joe says, “She loves what you do for her, as my customers love what it is I do for them. But she does not love you, David.” We may note that the robot was based on a “user centered” design.

This is the feeble heart of affective computing’s power: it is user centered design. She does not love David, the robot boy. She loves what he does for her. Similarly, affective computing is not truly social; it only appears to be. It is not truly emotional; it only appears to be. That is, after all, the definition of the word “affect.” Affective computing — and therefore a social bot — is an endocrinological mirror held up to the user so that they feel what the holder of the mirror wants them to feel.

And this is where the power of affective computing lies: emotional control over the user.

Why are we writing code so that social robots or social avatars display empathy? To control the user. It’s like writing a love letter; it appears empathetic, generous, gorgeous (and it may well be), and it is an effort to control. Writing the love letter tries to control the beloved. Writing the bot tries to control the user. It’s about power.

This is why bots have ethical implications.

This is nothing new, nor is it necessarily insidious. Directors shoot movies that make people cry. Authors write books that make people laugh. Sculptors, painters, writers, musicians, thespians, and now programmers rely on these emotional effects because affect is effective.

Affect is Virtual

Pretty soon, VR spaces are going to be populated by an indigenous species of software robots, or graphical bots, or virtual people, or autonomous avatars, or something catchy (they’ll have a proper name soon). These bots will live in environments like Land’s End, and they will be as beautiful, mysterious, and well-designed as Land’s End itself.

Land’s End, by UsTwo Games (photo by author)

These graphical bots will be nothing like the uncanny zoos of zombies we avoid today, but angels, devils, truly fantastic seraphim that are interesting, engaging, and know us. They will be crafted like love letters. They will use affect libraries. They will know us as individuals, we will know them as bots, and the emotional valence of the interaction will cause these highly social robots to be lovable. Quite lovable. More than many people in our lives. We’ll want to talk with them as much as we’ll want to spend time in VR.

This is the rise of the social robot. And they will exist in virtual worlds because, simply, they’re virtual people.

When I asked my four-year old son who the figures were in Land’s End he said, “They’re robot people.”

When I asked what they would say, his reply was “How do you do?”

— — — —

Mark Stephen Meadows is an American author, inventor, and designer. With 20 years in VR, 15 in NLP/AI, and 5 in blockchain he has designed and developed artificial intelligence applications at some of the world’s top research labs (Xerox-PARC, SRI, The Waag Society, and others). He has worked as a government-level consultant in both hemispheres, is the author of a half-dozen patents, and has written four books that examine technology and its social consequences. He is President of Botanic.io, where he leads the vision of the company by inventing new methods of computer-human interaction, designing the hearts and minds of highly social avatars and graphical bots. Follow him on Twitter @meadovian.
