Audience Emotions Second by Second

Julien YoufirstAI · Published in youfirst · 5 min read · May 18, 2017

In February 2017, Golden Times, an entertaining talk show produced by the national television of Slovakia (RTVS), was tested with Youfirst, an emotion AI software that tracks the facial expressions of viewers while they watch a video from the comfort of their homes. Youfirst works with established ESOMAR agencies or directly with influencers to analyze video content; so far it has analyzed over 2 billion frames.

Design of a facial coding study

Youfirst uses computer vision and machine learning algorithms to extract emotions from the facial expressions of viewers. Individual viewers’ responses are aggregated to form a statistically relevant picture of the emotional experience at each moment in the video. The emotional states that are tracked represent the universal building blocks of a human emotional response: anger, disgust, fear, happiness, sadness, and surprise*.

* Ekman, P. (2001). Facial expressions. In Blakemore, C. & Jennett, S. (Eds.), Oxford Companion to the Body. London: Oxford University Press.
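As an illustration of the aggregation step described above, per-frame scores from many viewers might be averaged along these lines (a minimal sketch in Python; the data layout and the use of a plain mean are assumptions, not Youfirst’s actual pipeline):

```python
from statistics import mean

# The six basic emotions tracked by the study (Ekman's universal expressions).
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def aggregate_frame(viewer_scores):
    """Average each emotion's 0..1 score across all viewers for one frame.

    viewer_scores: one dict per viewer, mapping emotion name -> score.
    """
    return {e: mean(v[e] for v in viewer_scores) for e in EMOTIONS}

def aggregate_video(per_viewer_tracks):
    """Collapse frame-aligned per-viewer tracks into one profile per frame.

    per_viewer_tracks: list of per-viewer lists of frame dicts, equal length.
    """
    return [aggregate_frame(frame) for frame in zip(*per_viewer_tracks)]
```

A plain mean is only one possible choice here; a median or a trimmed mean would damp the influence of a single unusually expressive face.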

Emotional facial expressions are recognized by specific combinations of facial muscle movements called Action Units (shown here exaggerated for better understanding).

According to predefined sample criteria, we invited a group of respondents from an ESOMAR online panel to take part in a facial coding study. After they agreed to the purpose and methodology of the research, an automated test of recording conditions was run: the webcam, the speakers, and the optimal recording conditions were checked. Our data sample consisted of fifty 1-hour-long recordings. The emotion data were complemented with a qualitative questionnaire. All of this was done from the comfort of the respondents’ homes, with no need to install anything.

An automated test of recording conditions runs on the respondents’ computers.

Outcome

The software does the emotion analysis at the frame-by-frame level, so the amount of data that comes from this type of research is enormous. In this research we gathered 3 million frames. Each frame carries information about the 6 emotions and the distance and angles of the face, along with socio-demographic categories and responses from the questionnaire. This leads us to one key piece of advice: specify your questions. After the sketch below, we present the 3 questions that were specified for RTVS.
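To make that volume concrete, a single analyzed frame might be stored as a record like the following (a sketch only; every field name here is an illustrative assumption, not Youfirst’s actual schema):

```python
from dataclasses import dataclass

@dataclass
class FrameRecord:
    """One analyzed frame for one respondent; field names are hypothetical."""
    respondent_id: str      # key into socio-demographics and questionnaire answers
    timestamp_ms: int       # position of the frame within the video
    emotions: dict          # the six basic emotions, each scored 0.0-1.0
    face_distance: float    # estimated distance of the face from the webcam
    face_yaw_deg: float     # head pose angles relative to the camera
    face_pitch_deg: float
```

Three million such records are far too many to eyeball, which is why the questions below were fixed before the analysis.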

1/ What is the flow between the two anchors? Building the emotional engagement of the viewer

Human emotions are built on stories and on our interpretations of those stories. From the emotion data we conclude that emotion builds naturally with a good narrative and gradation. If the narrative is discontinuous, the emotion curve does not reach high enough values. Discontinuity may be caused by inauthenticity, problems with understanding, or a ragged pace.

It was the uneasy impression and uncoordinated interaction of the anchors that did not allow the audience to step out of the smiling zone into amusement. The positive emotion curve jumped up here and there; however, there was never a moment when the anchors were able to build upon the guests’ narrative or upon each other’s.

2/ What are the critical moments we should be aware of? Beware of the different stages of channel surfing

Reacting emotionally signals that something personally significant is going on; not evoking emotions in our audience means losing them. There are 3 types of critical moments that may arise, depending on when they happen. The first is an unengaging beginning: if you lose your audience here, you risk that they will find another program to entertain them that evening, although they might come back later on. The second is an unengaging mid-program moment: if you lose your audience here, they leave with an opinion already formed, and it might be an unfavorable one. The last is the ending: you want to hype the ending to make sure they come back, so do not let them slip back to neutral.

The analysis uncovered a number of critical moments in the program when the engagement level was low. The beginning of the program and its last quarter did not evoke emotional reactions, while the musical insertions that were supposed to serve as entertainment evoked frowning instead.
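For illustration, low-engagement spans like these could be located in an aggregated intensity curve roughly as follows (a sketch only; the threshold, frame rate, and minimum duration are assumed values, not Youfirst’s parameters):

```python
def find_flat_spans(curve, threshold=0.15, min_seconds=30, fps=25):
    """Return (start, end) frame ranges where intensity stays below threshold.

    curve: one aggregated emotional-intensity value (0..1) per video frame.
    Only spans lasting at least min_seconds are reported.
    """
    min_frames = min_seconds * fps
    spans, start = [], None
    for i, value in enumerate(curve):
        if value < threshold:
            if start is None:
                start = i  # a flat span begins
        else:
            if start is not None and i - start >= min_frames:
                spans.append((start, i))  # span was long enough to report
            start = None
    if start is not None and len(curve) - start >= min_frames:
        spans.append((start, len(curve)))  # curve ended while still flat
    return spans
```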

3/ What are the power moments we should build upon? Power moments for trailer creation

Youfirst is able to automatically detect moments where the audience responded with higher intensity; these are called power moments. It is even possible to identify the exact quotes. Power moments do a good job in trailers and targeted campaigns.
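One simple way to flag such moments would be to look for frames where the aggregated intensity rises well above its baseline (a minimal sketch; the z-score rule is an assumption, not Youfirst’s actual detector):

```python
from statistics import mean, stdev

def find_power_moments(curve, z=2.0):
    """Return frame indices whose intensity exceeds mean + z standard deviations.

    curve: one aggregated emotional-intensity value per frame.
    """
    mu, sigma = mean(curve), stdev(curve)
    return [i for i, value in enumerate(curve) if value > mu + z * sigma]
```

Merging runs of adjacent flagged frames and aligning them with the transcript is what would make it possible to pull out the exact quotes.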

Our research identified that it was especially one guest, the gentlemanly and witty veteran actor and humorist ML, who was able to engage the audience. Here are two of the amusing comments identified by Youfirst.

“You look wonderful… younger indeed. But how MA has rejuvenated!” Said to one of the anchors, who used to have a much older anchor partner, MA.

“Are we going to get some money for this?” ML, asking whether they would get paid for being so successful at guessing the year in which certain shots were recorded.

Unfortunately, the program anchors did not build upon these gags.

Successful implementation of results

The results of the research yielded concrete proposals for changes, which were used in the program refresh. The refreshed version is now on air. The dramaturgy of the program has changed and now supports interaction and storytelling. ML became a permanent guest of the program, and the musical insertions were completely replaced.

ML = Milan Lasica, who became a permanent guest of the talk show Golden Times

After the refresh of the program, carried out in line with the conclusions and recommendations of the facial coding research, the performance of the talk show Golden Times has been growing continually. Comparing viewers’ behavior in fall 2016 with spring 2017:

Mean Viewership: +9% (more people watch the talk show Golden Times)

Audience Loyalty: +14% (viewers watch the talk show Golden Times for longer)

Source: PMT/TNS.sk

Benefits of a facial coding research

Emotion data offer an analysis of spontaneous reactions, second by second. It all happens in fast-forward mode, because the audience is reached on their own devices and the analysis is done automatically by software. Both the quality and the intensity of the emotion are identified, and all of this with targeting specifications.
