For our company, this autumn started off with exciting events: Neurodata Lab became a Gold Sponsor of the most prominent event in the world of affective technologies, the International Conference on Affective Computing & Intelligent Interaction (ACII 2019). Moreover, in conjunction with the conference we organized a workshop together with Alessandro Vinciarelli, Full Professor at the University of Glasgow.
Held biennially, ACII took place this year in Cambridge, UK. The conference attracted over 400 participants from all over the world, with keynote and invited talks by such renowned academics as Rosalind Picard, Lisa Barrett, Simon Baron-Cohen, and Thomas Ploetz. Moreover, research on human-machine interaction and systems working in this field traditionally draws the attention of tech corporations, so the conference was co-sponsored by Apple, Microsoft, Disney Research, SoftBank, and Neurodata Lab.
As part of the conference, Neurodata Lab organized SEAIxI, an International Workshop on Social & Emotion AI for Industry, held jointly with the University of Glasgow. The speakers came from universities and labs, research units of big corporations, and startups that create innovative products based on emotion technologies. SEAIxI became one of the most popular workshops of the conference, as it provided an opportunity to discuss an issue that concerned most of the participants: closing the gap between academic research and industrial use of affective technologies (such as systems capable of analyzing emotional states and dealing appropriately with human attitudes, feelings, and expectations).
We invited both participants who aim to advance the science and those who work across a range of industries and actively apply the results of this research. Here are some of the highlights of their talks.
Health & Well-being
One of the workshop’s keynote speakers was Charles Nduka, a reconstructive plastic surgeon and the founder and CSO of Emteq, a Brighton-based startup that makes wearable VR devices that read facial movements and give feedback on facial muscle activity. Nduka described the evolution of his project: originally designed to help his patients exercise facial muscles, his VR device was later transformed into a pair of high-tech glasses capable of tracking the facial expressions and behavior of patients with Parkinson’s disease.
The information collected by the device helps doctors adjust medications and monitor the progression of the disease. In the future, Nduka intends to advance his technology to the point where it would be able to tackle depression by monitoring emotions and tracking patients’ mental health. In addition, Nduka gave a talk on the importance of understanding behaviors in context, particularly when the individual is away from an electronic device or where camera usage is unsuitable or inappropriate.
Another contributor whose talk focused on health and well-being was Daniel McDuff, a SEAIxI keynote speaker currently working at Microsoft Research. His talk covered several aspects. First, he shared the details of a study undertaken at Microsoft in which researchers recorded daily user-computer interactions and compared them with the facial expressions of the same users captured by laptop cameras during the day. The study was aimed at developing software that would support users’ well-being: the researchers attempted to identify the emotions generated by different tasks or products in Windows OS (such as time spent at the workstation, dealing with email, etc.). This data would help Microsoft build more natural user interfaces that let people increase their productivity and spend time at the computer with less harm to their health.
Second, he presented an overview of state-of-the-art approaches to emotion sensing and synthesis that can be used to advance human-computer interfaces (such as speech synthesis). According to McDuff, such systems, including emotion-aware natural language conversation systems, cross-domain learning systems, and bots with intrinsic emotional drives, would leverage behavioral and physiological signals.
The third contribution on this topic came from a group of scientists from the Wearable Information Lab at Aoyama Gakuin University: Kizito Nkurikiyeyezu, Anna Yokokubo, and Guillaume Lopez. In their study, they attempted to develop an efficient thermal control system for the human body. Their research focused on neck coolers: they explored how these affected people’s perception of thermal comfort and proposed a method to deliver optimal thermal comfort based on a person’s heart rate variability (HRV). The scientists took into account the fact that thermal comfort is person-specific and often depends on unpredictable circumstances, so only a person-specific model would solve the problem. This very issue, however, makes personalized models too expensive for mass production, since each requires extensive training data; nevertheless, the scientists managed to create a hybrid, cost-effective technique that derives a satisfactory person-specific-like model from samples collected from a large population. Such a technique could be adopted in office spaces to make employees’ work more comfortable; moreover, this approach makes such offices energy-efficient.
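The general idea behind such a hybrid approach can be illustrated with a minimal sketch (not the authors' implementation; the single HRV feature, the comfort scale, and all numbers below are invented): fit one population-level model, then shift it toward a person-specific-like model using just a few cheap calibration samples from the individual.

```python
# Hypothetical sketch: derive a person-specific-like comfort model by
# calibrating a population-level model with a few personal samples.
# Feature: one HRV-derived value; target: a comfort score on a 1-7 scale.

def fit_population_model(samples):
    """Least-squares fit comfort = a * hrv + b over pooled population data."""
    n = len(samples)
    sx = sum(h for h, _ in samples)
    sy = sum(c for _, c in samples)
    sxx = sum(h * h for h, _ in samples)
    sxy = sum(h * c for h, c in samples)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def personalize(pop_model, personal_samples):
    """Shift the population model by the person's mean residual,
    avoiding the cost of training a full per-person model."""
    a, b = pop_model
    bias = sum(c - (a * h + b) for h, c in personal_samples) / len(personal_samples)
    return a, b + bias

# Invented population data: (HRV value, reported comfort)
population = [(20, 2.0), (40, 3.5), (60, 5.0), (80, 6.5)]
model = fit_population_model(population)

# A few cheap calibration samples from one individual who runs "warm"
personal = [(40, 3.0), (60, 4.5)]
a, b = personalize(model, personal)
```

The shared slope comes from the large population sample, while only the cheap per-person offset needs individual data, which is what keeps such a scheme viable for mass deployment.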
Speech Technologies
A couple of speakers dedicated their talks to speech technologies. Dagmar Schuller, CEO and co-founder of audEERING, described how technologies that use deep learning to detect emotions, age, gender, fitness, and other speaker parameters from audio signals can enhance user experience. In particular, Schuller described devAIce, an AI technology that integrates into devices and improves customer experience in real time by hearing and identifying the surroundings. In addition, she mentioned another solution created by the company: a Unity plugin that detects players’ agitation levels in real time, thus enhancing their gaming experience.
Another talk in this section was presented by a group of researchers from the University of Texas at Dallas: Michelle Bancroft, Reza Lotfian, John Hansen, and Carlos Busso. Their study explored speaker verification in settings where the intersection between speaker and emotion recognition matters (such as 911 calls). Within the framework of their research, they collected a pool of sentences from multiple speakers (132,930 segments), some of which belong to 146 speakers in the MSP-Podcast database. The framework they built trained speaker verification models, which were used to retrieve candidate speaking turns from the pool of sentences. The emotional content of the sentences was detected using state-of-the-art emotion recognition algorithms. The subsequent experimental evaluation provided promising results, with most of the retrieved sentences belonging to the target speakers and containing the target emotion.
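The retrieval step of such a pipeline can be sketched in a few lines (a toy illustration, not the authors' code; the embeddings, emotion labels, and similarity threshold are invented): candidate turns are scored by the cosine similarity of their speaker embeddings to the target speaker's embedding, then filtered by the predicted emotion label.

```python
import math

# Illustrative sketch: retrieve candidate speaking turns for a target
# speaker by cosine similarity of speaker embeddings, keeping only turns
# whose predicted emotion matches the target emotion.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(pool, target_emb, target_emotion, threshold=0.8):
    """pool: list of (segment_id, speaker_embedding, predicted_emotion)."""
    return [sid for sid, emb, emo in pool
            if emo == target_emotion and cosine(emb, target_emb) >= threshold]

# Invented 2-dimensional embeddings for four speaking turns
pool = [
    ("seg1", [1.0, 0.0], "angry"),
    ("seg2", [0.9, 0.1], "happy"),   # right speaker, wrong emotion
    ("seg3", [0.95, 0.05], "angry"),
    ("seg4", [0.0, 1.0], "angry"),   # wrong speaker
]
hits = retrieve(pool, [1.0, 0.0], "angry")
```

Real systems use learned embeddings of much higher dimension, but the selection logic, similarity to the target voice combined with an emotion filter, is the same.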
Automotive & Smart Mobility
The Automotive section was represented by Daniel Lopez-Martinez, Neska El-Haouij, and Rosalind Picard (Harvard and MIT).
Most large automotive companies are developing affective automotive user interfaces, recognizing that interaction with the driver can impact safety not only through distraction, but also through a mismatched affective state, which may be caused, for example, by the navigation system’s voice. Consequently, an interface with more humanlike social-emotional intelligence would know how to change its tone of voice to optimize safety.
The research undertaken by Lopez-Martinez et al. focused on empathetic automotive user interfaces that monitor the driver’s emotional state and can intervene at moments when the driver may be falling asleep or losing focus, thus improving overall driving safety. The scientists proposed a multi-view multi-task machine learning method for detecting the driver’s affective state from physiological signals. The method relies on personalized machine learning, since individual reactions to stressful situations vary greatly across the population; their multi-view multi-task framework accounts for inter-subject and inter-drive variability in affective responses to the driving experience.
The scientists evaluated their models on three different datasets of real-world driving experiences and reported that their approach significantly improved model performance.
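The personalization idea behind such a framework can be sketched as follows (a toy illustration of the multi-task aspect only, not the authors' implementation; the data, features, and hyperparameters are invented): each driver's predictor decomposes into a shared component plus a small regularized per-driver offset, and both are trained jointly.

```python
# Toy sketch of multi-task personalization: each driver d predicts an
# affect score with weights w_d = w_shared + v_d. The shared component is
# learned from all drivers; small regularized per-driver offsets v_d
# absorb inter-driver variability.

def train(data, steps=500, lr=0.05, reg=0.1):
    """data: {driver_id: [(feature, label), ...]} with one scalar feature."""
    drivers = sorted(data)
    w = 0.0                           # shared weight
    v = {d: 0.0 for d in drivers}     # per-driver offsets
    for _ in range(steps):
        gw = 0.0
        gv = {d: 0.0 for d in drivers}
        for d in drivers:
            for x, y in data[d]:
                err = (w + v[d]) * x - y
                gw += err * x
                gv[d] += err * x + reg * v[d]  # regularizer keeps offsets small
        w -= lr * gw / len(drivers)
        for d in drivers:
            v[d] -= lr * gv[d]
    return w, v

def mse(data, w, v):
    errs = [((w + v[d]) * x - y) ** 2 for d in data for x, y in data[d]]
    return sum(errs) / len(errs)

# Two drivers whose affective response to the same stressor differs in gain
data = {
    "driver_a": [(1.0, 1.0), (2.0, 2.0)],
    "driver_b": [(1.0, 1.4), (2.0, 2.8)],
}
w, v = train(data)
```

After training, the shared weight sits between the two drivers' individual gains while the offsets capture how each driver deviates from the population, which is what lets such models personalize without training each driver's model from scratch.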
Catherine Pelachaud, a SEAIxI keynote speaker from CNRS-ISIR, Sorbonne Université, gave a talk on how we can build socially aware virtual interaction partners. In her study, Catherine stressed the importance of agents using verbal and nonverbal means of communication to adapt to their partners, show empathy, and even manage the impression they produce. She also elaborated on a virtual-character platform capable of displaying a large palette of social attitudes and emotions, and talked about recently developed computational models that adapt their behavior in real time to convey specific intentions.
Emotion Recognition from Different Modalities
Antonio Camurri (University of Genoa), a special guest at the SEAIxI workshop, talked about multi-timescale sensitive movement technologies and the EnTimeMent project. As the founder and scientific director of InfoMus Lab and of the Casa Paganini–InfoMus Research Centre at the University of Genoa, he presented his research on the affective analysis of bodily movement in interactive dance and music systems and art projects.
Another study was undertaken by Hang Li, Siyuan Chen, and Julien Epps (Dept. of Electrical Engineering & Telecommunications, The University of New South Wales, Australia). They developed a method for identifying seven basic upper facial action units by analyzing only part of the face, namely infrared eye images. This approach achieved 78.8% accuracy and could be used for wearable applications of automatic facial expression analysis.
Ethics of Emotion AI
Furthermore, a presentation on ethics, emotions, and AI was given by Andrew McStay (Bangor University), who emphasized the importance of stringent regulation in the emotion recognition sphere to protect the public interest. McStay also drew attention to existing practice: certain retailers have already begun to record their customers’ behavior. Interestingly, McStay’s talk was based on several UK-based surveys aimed at identifying how people felt about emotion capture. He stressed that national regulators need to consider the ethical consequences of the use of emotion recognition technologies.
Affective Computing Industrial Landscape
The workshop concluded with a presentation by Olga Ignatyeva from IPERF, the International Institute for Research Performance and Innovation Management e.V., Berlin, Germany. She gave a talk about business models in affective computing. In particular, Olga suggested how to commercialize the technical solutions offered by other participants of the workshop, and covered possible ethical issues related to the process. In addition, she had interviewed numerous startups and scanned the market landscape (mostly the US and Europe) to understand where the Emotion AI industry is most developed and which types of data different industries are using. She also studied their revenue streams and key partners.
Overall, the workshop was enlightening and educational. We aim to organize such events more frequently in the future, paying special attention to the industries with the highest potential for the use of Emotion AI.
Author: Francesca Del Giudice, PR Specialist at Neurodata Lab.