The Poetics of Visualizing Conversational AI: Our Process and Progress

Part 1: Design a screen-based interactive AI counseling application that responds to content and emotions being expressed in conversations between counselors and clients.

Session 01 // But first, tea and a postmortem.

Monday 22 October 2018

In the afternoon, we kicked off the project by reflecting on our previous work, and discussing our working styles and personal goals for this project over warm beverages.

Personal Goals

Cathryn would like to consider the boundaries of what can be expressed visually with AI. She’d like to find ways to test the system to the point of breaking it, in order to “learn the bounds of the material.”

Ulu is interested in how to visually represent a cognitive consensus. Most conversational agents’ visuals reach a level of complexity that involves receiving, interpreting and transmitting, as well as showing confirmation or a lack of understanding… and that’s it. But in conversation, and in a counseling setting, nuances of understanding and interpretation are so important: head nods, utterances, gestures that show continuous understanding. How might an AI system show signs of consensus with the user?

Gautham wonders about the identity of the AI counselor: what it feels like, speaks like… what is its personality? How can this be expressed not just visually, but emotionally? What does it mean to have a conversation with AI?

We also want to think about how records of conversations are kept and displayed in a sensitive manner, even between the agent and the client. There is a great deal of perceived judgment associated with visiting counselors, a stigma that makes AI counselors preferable in some respects.


We think it’s fortunate that these goals are not necessarily in line with one another, yet not in conflict either; in other words, we can produce something that’s reflective of and respectful toward each of our personal imperatives.

We then used these goals as heuristics for evaluating which script to use as the basis for our interaction. We felt that Transcript 1 would allow us the greatest opportunity to experiment in these areas.

In Transcript 1, the client is moving through a range of emotions related to a breakup. There is a range of banter between client and therapist, with indications of careful listening, processing, and recalling that will give us room for varied visual explorations.


Session 02 // Person(a)building

Tuesday 23 October 2018

Following up on our first session, we met to dig into our script a little deeper and consider what drew us to it, and what that could tell us about our prospective user. But first…

How will we prototype this?

This is still unresolved, and that’s okay. Within our team we have comfort and relative expertise in both mocking up in After Effects and coding in JavaScript, but we need to take more time to consider the pros and cons of each option, and how we can best achieve what we initially discussed. Regardless, we can prototype in the mode we’re most comfortable in to express our ideas internally, and consolidate with the most logical method for our deliverable.


I took a bad photo of a beautiful whiteboard, and tried to make up for it.

Our User

We then spent some time building our persona. We based him not just on our script, but also on rational motivations for someone wanting to choose an AI counselor over a human.

From this, we began to build our first protagonist (the second being the counselor we will visualize as our deliverable):

Based on the script, we envision him to be in his late 20s to early 30s and unmarried. His hobbies indicate that he is middle class, and his family’s attitude towards mental health indicates some lack of familiarity, perhaps due to cultural differences or environment. We think he might be experiencing undiagnosed anxiety, which may stem from, or be exacerbated by, his recent breakup.

AI is now allowing him to try out counseling with a low barrier while his problem is fairly small, before it becomes a crisis. It also provides him an outlet he might not otherwise have — without trusted humans to vent to, where else could he turn? But perhaps most important, we imagine he decided to use an AI counselor to escape judgment, both from his inner circle and from a human counselor.

Big Questions

Gautham raised an excellent point about that last bit: if AI is a shelter from fear of judgment, how does AI help the patient address that fear? This question could be very valuable to consider when we move to part two of the project, considering how the AI counselor would “hand off” care to a human in a time of crisis.

Individually, we’ll do some research to answer some questions about AI, mental health professionals, interpersonal nonverbal communication and the spaces where they meet:

  • What counseling heuristics might exist for how mental health professionals compose their expressions, tone and body language while in session with a client? Cathryn has a contact who might be able to point us in the right direction. She’ll also consider other ways of visualizing emotions.
  • How are existing AI counselors addressing some of the things we want to explore, with regard to personality and emotions, degree of recordkeeping, and where are their shortcomings? Ulu’s got some apps downloaded and will investigate.
  • What other inputs besides voice and typing are being used in the AI space that might be used to find a cognitive consensus? Gautham will find out.
  • Ulu read somewhere that AI counselors are actually preferable to human counselors in certain instances. She’s gotta find where she saw that.

Session 03 // Extracting hidden communications

Wednesday 24 October 2018

Feedback

We began the session by running through our discussion with Q and Daphne.

We learned that we have the freedom to tweak our script for the medium, but are asked to predominantly follow it. This is likely a good thing: by following an interaction with an actual human therapist, we are more likely to imbue our AI agent with the best of what humans bring to the therapy experience.

In Q’s feedback, he highlighted the importance of considering the visual representation of AI in real time conversations — that the visuals are crucial for telling the user that the AI is more than a machine, and allow for the boundaries of AI to change.

Daphne asked us to consider, as we create our storyboards, the physical context of use, which would influence how subtle or overt the interactions need to be: how “loud” would things have to be in public versus at home?

Q also pushed us to consider integrations with other parts of the user’s life: how would interactions be impacted if the agent could see your calendar, for instance? Daphne also asked us for an opinion on whether the system is learning about you as you use it, or if it’s a one-off tool for use in times of crisis.

Looking ahead to the final deliverables: in your final output, you’d have some setup about what this is and who the person is. You can think of the script as a scenario for demonstrating your system. The existing AI bots, for example, are all interactive through typing; that’s a big difference from what we’re doing. There are fewer existing efforts around how AI like this should work, and that can change the user experience. We’re exploring unventured territory. If you focus on speech and visualization, would it help the user reflect on what they say? If the purpose is not to search and browse, and it deals with personal issues, how should it look?


Research findings

We then shared our research findings with one another to consider.

Principles of Therapy

Visualizing Emotions

There’s a lot out there. Dan Lockton gathered a lot of resources for his New Ways to Think mini:

Interactions

Look Away: Thiiiiis is creepy. But it’s definitely an interesting interaction paradigm.

Alice guesses the emotion of the face you’re drawing. An interesting consideration for interactions besides voice.

motionEmotion is an emotion & gesture-based arpeggiator and synthesizer. It uses the webcam to detect points of motion on the screen and tracks the user’s emotion. It draws triangles between the points of motion, and each triangle represents a note in an arpeggio/scale that is determined by the user’s emotion.

Inputs that consider emotions

Cathryn also found a handful of useful APIs we could incorporate into our product; a rough sketch using one of them follows the links below.

  • https://trackingjs.com/docs.html#trackers
  • https://itnext.io/face-api-js-javascript-api-for-face-recognition-in-the-browser-with-tensorflow-js-bcc2a6c4cf07
  • https://github.com/auduno/clmtrackr
  • http://blog.affectiva.com/emotions-in-your-web-browser
  • https://developer.affectiva.com/metrics/
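To get a feel for what these libraries make possible, here’s a minimal sketch, assuming a typical face-api.js setup (model weights served from a /models directory and a <video id="webcam"> element showing the camera feed); it polls for the dominant facial expression, which could drive the agent’s visual state:

```
// Minimal sketch (assumes face-api.js, linked above, with model weights
// served from /models and a <video id="webcam"> element on the page).
import * as faceapi from 'face-api.js';

async function watchExpressions() {
  // Load a lightweight face detector plus the expression-recognition model
  await faceapi.nets.tinyFaceDetector.loadFromUri('/models');
  await faceapi.nets.faceExpressionNet.loadFromUri('/models');

  const video = document.getElementById('webcam');

  setInterval(async () => {
    const result = await faceapi
      .detectSingleFace(video, new faceapi.TinyFaceDetectorOptions())
      .withFaceExpressions();
    if (!result) return; // no face in frame

    // result.expressions is a score map like { happy: 0.91, sad: 0.02, ... };
    // take the strongest expression to drive the agent's visual state
    const [emotion, score] = Object.entries(result.expressions)
      .sort((a, b) => b[1] - a[1])[0];
    console.log(`dominant emotion: ${emotion} (${score.toFixed(2)})`);
  }, 500); // poll twice per second
}

watchExpressions();
```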

Gautham also discovered this interesting tool for detecting verbal tone and coding intensity with color:
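As a rough illustration of that tone-to-color idea (our own placeholder mapping, not the tool itself), an emotion label could pick a hue while the tone’s intensity score drives saturation:

```
// Hypothetical sketch: map an emotion label and an intensity score in [0, 1]
// (the shape of output tone-analysis services typically return) to a CSS
// color. The hue assignments here are our own placeholders.
const EMOTION_HUES = {
  joy: 50,      // warm yellow
  sadness: 220, // blue
  anger: 0,     // red
  fear: 280,    // violet
};

function toneToColor(emotion, intensity) {
  const hue = EMOTION_HUES[emotion] ?? 0;
  const saturation = Math.round(intensity * 100); // stronger tone, more vivid
  return `hsl(${hue}, ${saturation}%, 60%)`;
}

// e.g. toneToColor('sadness', 0.8) -> 'hsl(220, 80%, 60%)'
```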

To add: comparative analysis of counseling chatbots, applications where AI chatbots are preferable to therapists


We spent the remainder of the session picking apart the script to find where our preliminary visual explorations should take place:


Session 04 // Cross-pollinating forms

Sunday 28 October 2018

We met on Sunday to discuss our early form ideas, what we learned from doing them, and to incorporate those insights into our storyboards. We considered questions surrounding both how the AI agent is reflected, and how the user’s feelings might be processed.

Ulu’s early form ideas

Just Eyes

Gautham’s explorations

Gautham explored three categories.

Abstract/Metaphorical: Nebular, Siri-like, technological in form; allows for diversity in visualizations.

Shapes’ arrangement to communicate diversity of emotions

Mapping against a familiar object, like an iceberg

Emotion Calendar

Symbolic:

Colors overlapping to represent complex emotions

IBM verbal interpretations

Text-based:

Lingering Questions:

How powerful is text? If we use it, how can it be used to visualize emotions? How else could we constrain/enable input?


We agree that having the visuals build up over the conversation is good.

Cathryn’s Explorations

Hyper-focused on how form conveys emotion. Thought about “mirror neurons,” empathizing based on your conversation partner’s face. How can we present an abstract form that the client can map onto, while imbuing some of that mirroring?

Inspired by a zine that uses a continuous line to show a face. Play with movement and shape to define the emotion that’s being reflected back to the person.

Building paradigm: you see the progression of emotion over time, emotions changing based on lines. Each dot is a marker to indicate that an issue has come up before, and can be used to navigate back through it.

Gautham: potential to use the dots as a way to identify triggers and dispellers; basically, to use the visualization as a form of CBT (cognitive behavioral therapy).
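To make the paradigm concrete, here’s a rough sketch of how the underlying data might be modeled (entirely our assumption; the names and fields are hypothetical):

```
// Hypothetical data model for the "line with dots" paradigm: a per-session
// emotion timeline, with markers (the dots) flagging issues that have come
// up before so the client can navigate back to them.
const session = {
  timeline: [
    // one entry per utterance, with the emotion the system inferred
    { t: 0,  speaker: 'client', emotion: 'sadness', intensity: 0.7 },
    { t: 12, speaker: 'client', emotion: 'anger',   intensity: 0.4 },
  ],
  markers: [
    { t: 12, issue: 'jealousy', firstSeen: 'session-01' },
  ],
};

// Navigate back: collect every moment a given issue has surfaced
function momentsFor(issue, sessions) {
  return sessions.flatMap((s) => s.markers.filter((m) => m.issue === issue));
}
```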


Session 05 // Peer feedback on form

Monday 29 October 2018

We began class by exploring last year’s projects: students designed vending machines.

Feedback and discussion with Daphne and Q:

  • What is the AI’s persona? The AI must build a relationship with the user, but not necessarily directly reflect emotions. Allow a form abstract enough for a person to map feelings onto, one that could slightly mirror client emotions. Cat: gestural face
  • Kiki and Bouba experiment: people associate hardness and softness with different line qualities
  • Building a consensus of what the emotional state of the conversation is. Ulu: eyes, Gautham: shape and color
  • Drawing: anxious about drawing own counselor (we chose to avoid a barrier to entry for people who aren’t comfortable drawing)
  • Represent both AI and emotional state
  • Eyes: determine the societal consensus of what they look like when mapped to emotion
  • Test look of UI versus certain emotional states.
  • How would user navigate reflection process?
  • Likes how, when a certain emotion is brought up, the system can relate it to past conversations and experiences
  • Choosing a next direction: abstraction versus defined shape
  • Whether the shape not untangling might cause anxiety
  • Ulu: How might we design affordances for a tool?
  • How can user give input to validate state of AI?
  • Don’t need to prove it’ll be effective. Just explore possibilities. Use research to validate design choices (color, shape)

Andrew Twigg critique

We had the pleasure of having another faculty member, Andrew Twigg, give critique and feedback about our concepts. Andrew asked us about the common issues we had been finding in our peer discussions:

  • How to determine if AI is understanding you? Trust in CUI understanding. How to show emotional connection with client. Consider what AI can realistically do. Use normal human responses to map emotional responses. Can add validation for user input to confirm emotion/intent.
  • Human versus abstract form (appropriate visual form). Is it ethical to represent a real person on the screen? How do you establish trust via form (cartoon may or may not be trustworthy)? What do I want my own therapist to look like? Think about the uncanny valley and how to represent human features.
  • Tracking progression, visualizing emotional state change. Gautham discussed. Limitations of color — can’t necessarily scale. How could system be personalized/adapt to user.
  • Including sound. May escalate/de-escalate intensity of emotion
  • To what degree should the interface reflect the emotion. When is it appropriate to mirror or to flip emotion? In other words, should the AI escalate an emotion versus de-escalate the intensity of emotion to calm a client?
  • Displaying other media. How do you manage screen real-estate?
  • Input types to consider. Only conversational input, no text. Face tracking is okay.
  • Non-linear conversations; no conclusive, clean end. What does it mean to go through a counseling process? A lot of progress bars lie by design. What does it mean to progress? How do you demonstrate progress on issues people are working through? Don’t be over-literal with the concept of progress.
  • Trains a person to work out their own emotions. When are they doing or saying things that aren’t in concert with good ways of working through emotion, especially over a long period of use?
  • Questions as a useful tool to prompt the user to think/feel certain things.
  • Designers should be skeptical of the technology they use, in order to understand its limits.

Things to reflect about and look into:

  • Look into digital counseling services. Understand how they work.
  • Look at other types of chatbots. What are they good for?
  • There is no end to therapy; reflect on our own experiences with it.

Next steps: Dive deep into different emotional states for 1-2 form ideas. Slideshow. Show narrow focus. Start with storyboard. Form and decisions you’re making.

If developing app: Look into Replika and X2 — show prototype of how AI would perform based on transcript.

Work session together: have constant feedback.

Debrief

Gautham: When is it appropriate to mirror/flip or de-escalate? Look at how counselors deal with emotion. Appropriateness of showing triggers. Go deeper into which triggers/where line drawn?

Ulu: Record convo on own. Figure out timing of how it’d feel. Live version of chart. Infuse what eyes are doing into forms. Add processing step

Session 06 // Choosing a visual metaphor & form exploration

Tuesday 30 October 2018

At the beginning of our meeting, we asked each other what we wanted to get out of it. We needed to decide on a visual metaphor, then explore our form before our presentation for the next class.

First, we decided on several states for our conversation to use as a starting point:

  • Overall state
  • Reason
  • Associated emotions
  • Exacerbation
  • Acknowledgement of jealousy
  • Current state

Then, we created a breadth of ideas during a 5-minute ideation session:

We were all drawn to the idea of tangling and untangling, and decided on this path, or string, as a visual metaphor.
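To see how such a string might behave on screen, here’s a hypothetical p5.js sketch (illustrative only, not our actual prototype) in which a noise-driven curve tangles more tightly as emotional intensity rises:

```
// Hypothetical p5.js sketch of the tangling/untangling metaphor: a curve
// drawn with Perlin noise that folds more tightly as "intensity" grows.
let intensity = 0.8; // 0..1; would be driven by the emotion analysis

function setup() {
  createCanvas(600, 300);
  noFill();
}

function draw() {
  background(250);
  stroke(40);
  beginShape();
  for (let i = 0; i < 200; i++) {
    // higher intensity means larger, faster wobble, so the line knots up
    const wobble = intensity * 120;
    const x = map(i, 0, 199, 40, width - 40)
      + (noise(i * 0.1 * (1 + intensity), frameCount * 0.01) - 0.5) * wobble;
    const y = height / 2
      + (noise(i * 0.1 * (1 + intensity), 99, frameCount * 0.01) - 0.5) * wobble;
    curveVertex(x, y);
  }
  endShape();
}
```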

Ulu and Gautham explored the visual form individually, while Cat worked on the presentation.

Exploration of tangling/untangling metaphor

Session 07 // Form presentation

Wednesday 31 October 2018

In class, we presented our form ideas:

Gautham’s prototype on reflecting in a conversation

Zach gave us consequential feedback, asking if our “detangling” metaphor was too prescriptive for users. Further, the idea that people would see a line tense up as they explored their emotions led some to ask whether it’d be more effective to allow the path to be only partially seen (for acknowledgement) and fade into the background. We were also encouraged to dive deeper into how we’d present the way the AI captured and allowed a user to reflect on their conversations.


Session XX // Final visual crit & a new team name

Monday 12 November 2018 (Happy birthday Momma!)

[include presentation and video]

Monday was an opportunity to share our final visual form and gain insight from our guests, both in the form of feedback and observation. We were particularly moved by other groups’ use of depth of space, aural channels, and color, and although these modes of interaction aren’t necessarily good fits for the paradigm we’ve chosen, they are valuable considerations for future work involving the representation of AI.

Overall our feedback was positive, but also highlighted areas still deserving of consideration:

Q and Daphne complimented the elegance and coherency of our representation, and challenged us to consider the textual representation of user data, with particular relation to the next part of the project. They also want us to be careful with the use of color, because done wrong it could break the system. Q wondered if frequency could be shown with a line distortion as well, for example.

Zach added that we should also begin to consider the questions of cognitive consensus that we had early in this project — with AI and records of thought, there are two areas vulnerable to misinterpretation — how the AI interprets the client’s input, and how the client interprets his own input. Both are equally important to consider moving forward, especially if we anticipate how this information will be translated to a healthcare provider.

Afterwards, we reflected as a team on our progress thus far, and what we want to consider moving forward. Some big areas of importance are:

  • Giving more consideration to onboarding and how it connects to the reflection process. What’s our database, and what’s our output?
  • Privacy considerations. What to reveal to outside groups, and where does consent come in?

Andrew stuck around afterwards to provide additional feedback:

What we appreciated from Andrew was the opportunity to get a “cold read,” so to speak: rather than hearing our pitch and rationale, he asked to confront the prototype first. He was immediately able to pick up on its intent, but, like many of our classmates, cautioned us not to be so harsh in presenting negative forms. Even for high-intensity states, a slight softening of the line points would be appropriate, he said.

He also thinks there should be a little more to bring the noting and reflecting stages together visually — in its zoomed-out form, the coil doesn’t look like it’s part of the same system as the line, and perhaps some contrast adjustments could help.