A Future for Phones + AI as Emotive Sidekicks

Marisa Lu
Published in Emoto · Apr 24, 2018

Documentation of the first ‘Emoto’ prototype. Collaborators: Lucas Ochoa and Gautam Bose

Interactive p5.js prototype: https://editor.p5js.org/lumar/embed/r1azSGU2X

What if we reframed how we relate to phones and the OS? To home assistants? Can we better ‘place-make’ for phones in the home? What might AIs with expressive movement for communication bring to the relationship?

The first prototype was a speculative fiction people could experience through a WOZ-driven (Wizard of Oz) live demo at the ‘Where are the Humans in AI’ exhibition, Spring 2018.

The first prototype came together in the Spring 2018 E-Studio class under Professor Dan Lockton, with feedback and critique from peers: Soonho Kwon, Cameron Burgess, Aisha Dev, Anna Gusman, Emma Brennan, Maayan Albert, Monica Huang, Helen Wu, and Jessica Nip.

Initial sketches drawn as a first reaction to Teenage Engineering’s Raven robot for Baidu’s AI

A conversation about the business models, cultural associations, and social norms surrounding IPAs (intelligent personal assistants), and a speculative design to turn phones into whimsical AI sidekicks. Among other things, the aim is to step away from the typical biases and begin transitioning society away from mobile-first habits (glued to tiny internet portals) toward tangible AIs, and perhaps ultimately toward fuller, undistracted engagement and presence in the moment.

Left: Baidu’s Raven by Teenage Engineering, 2018. Right: a personal technical audit for a physically emotive home assistant, 2017.

Something all of these systems seem to have in common is that they fail to consider (with good reason, of course) that we already have an AI in our pockets: our phones.

Why would we be interested in making our phones AI sidekicks? Because re-contextualizing where, when, and how we interact with our phones, and place-making for them more definitively in an environment, may be a step toward ungluing ourselves from screens and toward more intentional interaction and a more personable dynamic.

There’s a lot to unpack about the field of home assistants (the technology, interactions, and social associations), our practices surrounding phones and their long-term implications, and, more specifically, this design’s emotive qualities, its consequences, and how to build up a design framework with sensitivity and nuance.

Interaction:

  • Meant to be selectively unobtrusive
  • Selective animated visual feedback
  • Voice-driven and gesture/motion-driven: if it doesn’t need to be said, don’t say it
  • Multi-line command-and-react interactions
  • Search, queries, and everything else you would do with your phone, clock, lamp, etc.
  • The machine’s learning process/onboarding, with its expected mistakes, is far more enjoyable, and users are more forgiving
  • Adaptive personality learned from users’ facial feedback
  • A unique experience catered to each user in the household
  • A wider range of emotive expression without crossing the uncanny valley

PROTOTYPING THE ANIMATION

Digital with Unity

… it got wonky sometimes. But otherwise Unity is an efficient method because it can deploy to mobile, web, and desktop (it seemed faster than Three.js). We used Baidu’s Raven form factor first so that we could keep finessing our physical design in parallel.
We used Unity Remote so that we could animate the lamp with many fingers and hands controlling all the servo sliders at once. Toggling the record option stored the HingeJoint component angles at each frame. Later on, this revealed a lot of problems in calculating the actual angles because of chain dependencies: each recorded angle is relative to its parent joint, so the true pose of any joint depends on everything above it in the chain.
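To make the chain-dependency problem concrete, here is a minimal Python sketch (not our Unity code): because each HingeJoint angle is stored relative to its parent, recovering the absolute angle of any joint means accumulating every angle above it in the chain. The joint count and angle values below are hypothetical.

```python
# Minimal sketch, not our Unity code: the chain and angles are hypothetical.

def absolute_angles(relative_angles):
    """Accumulate relative hinge angles (base -> tip) into absolute angles.

    Each recorded HingeJoint angle is measured against its parent link,
    so the real-world angle for joint i is the running sum of joints 0..i.
    """
    absolute = []
    running = 0.0
    for angle in relative_angles:
        running += angle
        absolute.append(running)
    return absolute


# Base tilted 20 degrees, mid joint bent 35 degrees relative to the base,
# head nodded -10 degrees relative to the mid joint.
print(absolute_angles([20.0, 35.0, -10.0]))  # [20.0, 55.0, 45.0]
```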

Physical:

Feedback, critique, inspiration, and challenges:

“Multiple contexts at once — would the parent trust the robot to trust the kid? Or in a friend setting, how does it work in multi-person social contexts?” — Emma

“Reminds me of this movie scene… the boy’s dad always sent him rocks as signs of how he is feeling — would be excited to see the morphing form speak like body language” — Emma

“What is the physical context and… uh, potential timeline behind our monolithic, abstract AIs?” — Cameron

“what are the service models behind the designs” — Gautam

“Monolithic AI is sedentary — it’s more robust because it’s not being translated across different contexts — and they work with the current limitations of technology. Minimizing change in context minimizes failure.” — Anna

“I’ve also never seen people carry around a Google Home”

“I would reassess why these current AIs would feel human, and it’s tied to the voice… EVE feels human because she has a voice. Wall-E doesn’t feel as human because he makes sounds…” — Maayan

But Wall-E communicates a lot throughout the movie with just sound effects and motion.

“Think more deeply on the service models — how many services would it take on or how modular would it be? Because as services change, the form can still stay relevant because it isn’t tied anywhere” — Helen

“Take on the emotive motion of a different species, not necessarily human” — Aisha

“Roly poly…projects that would enable physical presence in long distance relationships”

Voodoo dolls

Michal callow? Robots-as-sidekick Ph.D.

“How would you feel if your 3 year old had an assistant?” — Aisha

So bringing in the schema/brand of a sidekick has a whole different set of connotations… there’s still, to some degree, some of the same hierarchy, but… the intelligent sidekick and the hero as figurehead?

There is so much variance in what/who counts as family (household servants that people grow up with?)

“The King of Versailles has two nannies… they aren’t really thought of, and are expected to be on call 24/7”

Google and Facebook running advertisements

You don’t buy a friend. But you can pay for an assistant

“Kids with Alexa and Google are ruder to their friends. They ask questions the way they address the AI assistant.” — Gautam

Dan… a writer who worries about his daughter no longer saying thank you

“we don’t treat people like Google search bars” — Soonho

“draw the dragon eye last” — Helen

“Alexa has the blue ring that indicates it knows where you are and that she is listening”

Accelerometer- and gyroscope-driven visuals:

We kicked off by exploring a breadth of representations for AIs, computers, and robots, from the abstract to the literal and anthropomorphic.

From abstract ‘neural nets’, to the literal insides of a phone, to Disney’s animated Wall-E robots

Eventually we went ahead and prototyped some basic interactions driven by the phone’s accelerometer.
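The actual prototypes ran on the phone (p5.js and Unity), but the core mapping is simple enough to sketch in Python. The thresholds, the ‘squash’ and ‘lean’ parameters, and the named states below are all made-up placeholders showing the idea of turning raw accelerometer readings into an emotive pose.

```python
# Illustrative only: thresholds, state names, and the squash/lean outputs
# are placeholders, not values from the actual p5.js/Unity prototypes.
import math

def expression_from_accel(ax, ay, az, rest_g=9.81):
    """Map raw accelerometer readings (m/s^2) to a simple emotive pose."""
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    jolt = abs(magnitude - rest_g)            # deviation from resting gravity
    tilt = math.degrees(math.atan2(ax, az))   # side-to-side lean of the phone

    if jolt > 6.0:                            # a hard shake or sudden drop
        return {"state": "startled", "squash": 0.6, "lean": tilt}
    if abs(tilt) > 30.0:                      # held at a noticeable angle
        return {"state": "curious", "squash": 1.0, "lean": tilt}
    return {"state": "idle", "squash": 1.0, "lean": tilt * 0.2}


print(expression_from_accel(0.5, 0.2, 9.7))    # resting on a table -> idle
print(expression_from_accel(12.0, 3.0, 14.0))  # shaken hard -> startled
```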

Feedback session:

Opening questions:

  • Level of fidelity for compelling communication?
  • Color psychology?
  • Universal emotional associations to iconography/shapes?
  • Sentiment analysis from Google + intonation?
  • Sound/animation inspiration? (Generative?)
  • Demo?
  • Modular animation framework?
  • How to frame it initially for the public?

Anna: do the emotional properties of the AI make me feel more emotionally healthy? How does this extend into the person’s life?

Feedback: synthetic sound to go with current aesthetic

keep going.

What does emotion add to the equation?

Emma: why is Google framed as a servant? Why does it need to be emotional? And what would it look like in a public setting or larger community?

We’re encoding emotion for what moral value? Social movement?

For kids, is this an onboarding AI?

Cameron: Eyes — what does that say — it suggests it can see as much as you do — eyes mean a lot and imply a lot of ability.

Is this even as smart as a creature? What does it imply when you make it like a creature?

Global implication of expectations vs. project scope. What do we want someone to get from this? Do people want this more than expressionless speakers? Is this a provocation?

Monica: the phone is with you at all times outside as a phone, but in the home, are you making the entire home your phone? So we just aren’t glued to a screen.

Aisha: ambient relationships — over time — growth over time (AI training) — think Faith’s fish

Anna: physical contact with its environment. What can this do? When does the physical motion become helpful?

Dan: use this as an emotional research probe

Another iteration

Video passwords: iteration

This iteration transitioned the previous spring system over to a more anthropomorphic interpretation reminiscent of Wall-E characters, only our system would have animations reactive to sensor input. (In a comparative analysis after the fact, we discovered a very similar project: the Pixar-style animated toy robot ‘Cozmo’.)

More attention was paid to the sensors available on the phone to bring real-world physics into the digital display.
The animation reflects real-time physical motion… but only on more powerful phones. LOL, such lag on the poor little 2015 Nexus. The code needs optimizing, or switching to a different renderer. WebGL? It might actually be an Android problem, though.

In progress glamour shots:

Having trouble deciding on color and composition. Will decide depending on how, when and where the photo is shown.

Prototyping Animations, again:

We’re ditching Unity.

…and going back to more physically intuitive methods like hand puppetry. The attached protractors help us get more accurate estimates of angles for hand-manipulated keyframe animation.

We’ve switched over from trying to digitally generate a hardcoded library of moves to a smaller set of keyframes with a modular system for varied interpolation.
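Roughly, the idea looks like this. It is a Python sketch with placeholder poses rather than the angles we actually captured with the protractors: a small library of keyframe poses, blended with interchangeable easing curves, so the same two keyframes can read as different moods depending on timing.

```python
# Placeholder poses and joint names, not the protractor-measured angles.

def ease_in_out(t):
    """Smoothstep easing: slow start, slow end."""
    return t * t * (3.0 - 2.0 * t)

def blend_pose(pose_a, pose_b, t, easing=ease_in_out):
    """Interpolate two poses (dicts of joint name -> angle in degrees)."""
    e = easing(t)
    return {joint: (1.0 - e) * pose_a[joint] + e * pose_b[joint]
            for joint in pose_a}

rest = {"base": 0.0, "mid": 45.0, "head": -20.0}
perk = {"base": 10.0, "mid": 80.0, "head": 15.0}

# Ten in-between frames from resting to perked-up, with ease-in-out timing.
frames = [blend_pose(rest, perk, i / 9.0) for i in range(10)]
print(frames[0])   # equals rest
print(frames[-1])  # equals perk
```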

Setting up the backend:

Shiftr.io and the Python paho MQTT client are great.

An iPad runs the WOZ control website, the shiftr.io visualizer is up on the screen, and a terminal runs a Python script on the Pi that receives and prints incoming signals.
The same process is set up for controlling the expression on the phone.
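For reference, the Pi-side listener is roughly this much code: a minimal paho-mqtt subscriber (1.x-style callbacks) that prints whatever the WOZ control site publishes. The shiftr.io instance, token credentials, and topic below are placeholders, not our actual setup.

```python
# Placeholder broker, credentials, and topic; paho-mqtt 1.x-style callbacks.
import paho.mqtt.client as mqtt

BROKER = "your-instance.cloud.shiftr.io"  # placeholder shiftr.io broker
TOPIC = "emoto/expression"                # placeholder topic

def on_connect(client, userdata, flags, rc):
    print("connected with result code", rc)
    client.subscribe(TOPIC)

def on_message(client, userdata, msg):
    # Print each incoming command so the WOZ operator can see it arrive.
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()
client.username_pw_set("token-key", "token-secret")  # placeholder credentials
client.on_connect = on_connect
client.on_message = on_message
client.connect(BROKER, 1883)
client.loop_forever()
```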

Physical Form:

Preparing for exhibition:
