Written All over Your Face

How tracking the emotions of users can be scary and fascinating at the same time

Franziska Roth
Zalando Design
8 min read · Jun 17, 2019

Illustration by Not Flipper

In one of Zalando’s most successful advertising campaigns, Scream for Joy (2012), we highlighted the excitement that the delivery of a new pair of shoes can bring. While we are still trying to live down the legacy of that campaign, it does highlight the emphasis Zalando has always put on inspiring emotion in our customers.

Shopping for clothes can be a highly emotional experience. There’s happiness, ‘This is exactly what I was looking for!’. There’s frustration, ‘I can’t believe my size is sold out’. There’s insecurity, ‘I’m not sure if this color is right for me’. And there’s even anger, ‘I just can’t make this trend work for my body type’.

We try to ignite moments of happiness in our users throughout the entire shopping experience, while also trying to address any concerns and frustrations they might encounter along the way. Problem-solving is often accomplished by appealing to customers’ rationality: helping them to better understand sizing or showing how to combine outfits, for example. Yet most decision-making is emotional [1], not rational, and we’ve come to realize we need to learn more about how to measure and assess emotions related to online shopping behavior.

…how might we track users’ emotions at scale in real-time while they are actually using Zalando, allowing us to immediately react to their emotions?

After I pitched the opportunity in one of Zalando’s hackweeks, it became part of our innovation accelerator. We sought to find a way to measure emotions in an automated manner in order to personalize the experience. A customer’s browsing experience would be based not simply on past behaviour, but also on a comprehensive understanding of what they actually love. For this, I got to lead a team of research engineers, software engineers, and designers for the duration of our sprints (Image 1). In those sprints, we asked ourselves: how might we track users’ emotions at scale and in real time while they are actually using Zalando, allowing us to react immediately to their emotions and take personalization and communication with our users to the next level?

Image 1: The team during one of our Design sprints

The first obvious step was to tackle the data and assess its quality. But, since tracking emotions comes with some privacy concerns, we also wanted to dig deep into the fears associated with tracking technology. Therefore, we also conducted qualitative interviews in our user research lab.

The Technical Setup and Prototype

Before digging into the results, I would first like to discuss the technical setup. The first challenge we faced was deciding which technology to use to track emotions. We settled on facial recognition: headbands would have been too intrusive, and wristbands wouldn’t have given an accurate insight into which emotion we were dealing with.

Due to time limitations, we decided to use an Amazon algorithm that analyses a person’s facial micro-expressions. It clusters emotions (happiness, anger, sadness, disgust, etc.) based on visible expressions, and then provides a confidence score on the accuracy of the clustering.

We started by asking users to browse Zalando in our user research lab while being recorded with the laptop camera. Via a Chrome extension, we connected to Amazon Web Services and used Amazon’s API to detect emotions in the images. The API in turn returned the emotion data in real time to our prototype, a dashboard that displayed changes based on the emotions users showed when viewing particular products (Image 2). We conducted six interview sessions lasting about 45 minutes each. Interviewees were asked to shop first for a gift, and then for something for themselves.

Image 2: Our Prototype Setup
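
To make the round trip above more concrete, here is a minimal sketch of picking the strongest emotion out of one analysed frame. It assumes the response has the shape of Amazon Rekognition’s DetectFaces output (a `FaceDetails` list whose entries carry `Emotions` with a `Type` and a `Confidence`). The article only says “Amazon’s API”, so the choice of service, the helper names `dominant_emotion` and `analyse_frame`, the region, the threshold, and the sample values are all assumptions made for illustration. This is not the prototype’s actual code.

```python
from typing import Optional, Tuple


def dominant_emotion(
    response: dict, min_confidence: float = 50.0
) -> Optional[Tuple[str, float]]:
    """Return (emotion, confidence) for the first detected face, or None.

    `response` is assumed to look like a Rekognition DetectFaces result.
    `min_confidence` is a hypothetical cut-off, not a value from the study.
    """
    faces = response.get("FaceDetails", [])
    if not faces:
        return None  # no face visible in this frame
    emotions = faces[0].get("Emotions", [])
    if not emotions:
        return None
    top = max(emotions, key=lambda e: e["Confidence"])
    if top["Confidence"] < min_confidence:
        return None  # too uncertain to report
    return top["Type"], top["Confidence"]


def analyse_frame(image_bytes: bytes, region: str = "eu-west-1") -> dict:
    """Send one camera frame to Rekognition (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parsing helper runs without AWS

    client = boto3.client("rekognition", region_name=region)
    # Attributes=["ALL"] asks for the full attribute set, including emotions.
    return client.detect_faces(Image={"Bytes": image_bytes}, Attributes=["ALL"])


# Made-up response for illustration only.
sample = {
    "FaceDetails": [
        {
            "Emotions": [
                {"Type": "HAPPY", "Confidence": 91.3},
                {"Type": "CALM", "Confidence": 6.2},
                {"Type": "SAD", "Confidence": 0.4},
            ]
        }
    ]
}
print(dominant_emotion(sample))
```

Keeping the parsing separate from the network call means the dashboard logic can be tried out on recorded responses without sending any new images to the cloud.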

We created the emotion dashboard so we would be able to observe how users interacted with the data (Image 3). Did they find it scary or uncomfortable to see their emotions displayed? Did it make sense to them? We had hypothesized that tracking emotions could provide a similar benefit to tracking activities, like running, for example. Users will give up access to their personal data in order to get something back.

For the sake of simplicity, we displayed only the products associated with positive emotions, making the dashboard a sort of enriched wishlist. For technical reasons, we didn’t focus on negative emotions: we collected this data but didn’t share it with the users.

Image 3: Emotion Dashboard
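
The “enriched wishlist” idea can be sketched in a few lines. The example below assumes each tracked frame has already been reduced to a (product, emotion, confidence) triple. The triple format, the `POSITIVE` set, the 60-point threshold, ranking by average confidence, and the product names are all hypothetical choices for illustration, since the article does not describe how the dashboard ranked products.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (product_id, emotion label, confidence from 0 to 100); an assumed format.
Sample = Tuple[str, str, float]

# Assumption: only "HAPPY" counts as a positive emotion here.
POSITIVE = {"HAPPY"}


def enriched_wishlist(
    samples: Iterable[Sample], min_confidence: float = 60.0
) -> List[Tuple[str, float]]:
    """Rank products by the mean confidence of their positive-emotion frames."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for product_id, emotion, confidence in samples:
        # Negative or low-confidence frames are ignored for the wishlist.
        if emotion in POSITIVE and confidence >= min_confidence:
            scores[product_id].append(confidence)
    ranked = [(pid, sum(vals) / len(vals)) for pid, vals in scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Made-up tracking samples.
samples = [
    ("sneaker-1", "HAPPY", 90.0),
    ("sneaker-1", "HAPPY", 70.0),
    ("coat-7", "SAD", 80.0),
    ("dress-3", "HAPPY", 95.0),
    ("bag-2", "HAPPY", 40.0),
]
print(enriched_wishlist(samples))
# → [('dress-3', 95.0), ('sneaker-1', 80.0)]
```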

THE RESULTS

More data can lead to more noise, which is why it is necessary to enrich, and thereby humanize quantitative data with quotes, stories and other data points.

Emotions observed

Before we dug into the tracking data, we first wanted to see which emotions we could observe in an interview without technology. Image 4 illustrates the team’s observations of facial and verbal expressions from several interviews, and does not include tracking data. It covers the journey from the Homepage to the Product Catalog to a Product Detail Page on the Zalando website. Positive emotions are shown in green, negative in blue.

Image 4: Emotion Journey 1.0

It is important to note how many ups and downs the curves contain. Human emotions can change in milliseconds and can also be mixed, meaning researchers’ visual assessments may not always be entirely accurate. During decision-making moments in the product catalog or on the product detail page, emotional changes were particularly pronounced.

Next, we looked at our tracking data. Image 5 shows a sample of our tracking data for happiness. Each line represents a user, with the vertical axis depicting the likelihood that the respective emotion (happiness in this case) was actually visible in the image. The horizontal axis shows the duration of the session. The lines in this graph look more extreme than those from the human observation shown in Emotion Journey 1.0, supporting the point that emotions can change in milliseconds.

Image 5: Tracked Happiness

In Image 6, we flattened the data in order to show it in a similar format to Emotion Journey 1.0. What we ended up with is likely closer to the “truth” of what actually occurred. With the extremes removed, the data became less dispersed than in Image 5. More data can lead to more noise, which is why it is necessary to enrich, and thereby humanize, quantitative data with quotes, stories and other data points (e.g., where a person is from, what they care about, what their motivations were, etc.). Humanizing data gives us a more accurate interpretation and, in turn, more actionable data.

Image 6: Emotion Journey 2.0
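
The article does not say how the data was flattened between Image 5 and Image 6. One common way to remove short-lived extremes from such a series is a rolling median, sketched below purely as an illustration. The technique itself, the window size, and the sample values are assumptions, not the method the team used.

```python
from statistics import median
from typing import List, Sequence


def rolling_median(values: Sequence[float], window: int = 5) -> List[float]:
    """Smooth a series with a centred rolling median.

    The window shrinks at both ends of the series so that the output
    has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


# Made-up happiness confidences (0 to 100) for one user over a session.
happiness = [5.0, 7.0, 95.0, 6.0, 8.0, 60.0, 65.0, 2.0, 70.0, 68.0]
print(rolling_median(happiness, window=3))
# → [6.0, 7.0, 7.0, 8.0, 8.0, 60.0, 60.0, 65.0, 68.0, 69.0]
```

In this toy series the single-frame spike to 95 and the dip to 2 disappear, while the sustained rise in the second half survives. That is the kind of trade-off behind choosing a median over a plain average, which would smear a spike across its neighbours instead of discarding it.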

User Reactions

“Oh yeah — the time I spend there — true. It makes sense.”

Users trusted the results because, OMG, it’s science! Several users, for instance, were unclear how to interpret the dashboard, but once they saw the graph, even if they didn’t particularly understand it, they were convinced of the result.

It was also remarkable that even though we told users at the beginning of the session that we would be tracking their facial expressions, they still thought we had used other methods to reach our results: eye tracking, measuring their “decisive clicks” or time spent on page, or researchers analyzing video recordings in real time or listening in on them.

“How did you..? This is exactly the thing which I had in mind. How did you know?”

Users felt understood, and were often shocked at how right the algorithm was in guessing their preferred products. This was visible when they scrolled through the emotion dashboard and commented on the products shown and the order in which they were displayed.

“My face belongs to me.”

Privacy concerns were less pronounced than anticipated. We expected a strong backlash from our scenario, even in a lab setting. But surprisingly, even our most critical user (who uttered the quote above) reported that she had forgotten about the emotion-tracking after a while.

Of course, privacy concerns that may or may not be expressed in a lab situation might look very different in the wild.

A QUESTION OF ETHICS

No matter how much we love data or how curious we may be, we always need to ask, “is this really best for the user?”

Within our team we spoke a lot about whether our research was ethically right or wrong. On the one hand, the machine was just doing what people do all the time: reading the emotions others display to the outside world. The data was beautiful really — there have been many times in my life I would have preferred such data to self-reports. And we had so many people from our personalization, advice and inspiration teams asking for the data, that there was an obvious need for us to better understand our users on an emotional level.

But, at times, it also felt as if we were overstepping a boundary. On one side stood our curiosity and the wish to do something good with the new data (ease user pain, create joy). On the other stood what we felt might be an intrusion into one of the last personal facets not yet being measured at scale: emotions.

I don’t have an answer to this ethical quandary, but I do know that as researchers we must be the vanguard of users in these conversations. No matter how much we love data or how curious we may be, we always need to ask, “is this really best for the user?”

As user research and design are no longer “just” about UI and UX, but also about algorithms and tracking and analyzing data, we need to take a step back and reflect on the balance between providing a great user experience and protecting users from the seemingly insatiable hunger for data.

[1] See e.g., Greco & Stenner (2008) for an overview of how emotions influence decision-making.

Dr. Franziska Roth is a senior user researcher with a quantitative focus for Zalando.

Thanks to Amicis Arvizu, Dr. Géraud Le Falher, Adrian Dampc, Francesco Mucio, Chris Szafranek, and Philipp Erler for joining and/or mentoring this project.

Thanks to Eileen Bernardi for her support on writing this story.

Franziska Roth
Zalando Design

I am passionate about helping people to thrive, breaking down barriers between different kinds of methods or data, and making complex ideas easily measurable.