Emotion Classification: at the heart of Affective Computing
Welcome to our first post! While our neural networks are training, we are starting our blog: we have been in Emotion AI for more than two years, have meticulously searched the landscape of emotion recognition and affective science, talked to leading experts in the industry, worked out our own approaches and vision, and are ready to share all that!
Emotion AI, apart from the obvious connection to machine learning and neural networks, arises from emotion science. The field poses several challenges, the biggest probably being the classification of emotional states. For instance, the process of annotation (matching visible facial expressions and other non-verbal cues to a proposed list of emotions and affective states) directly depends on the accuracy and comprehensiveness of emotion classification.
Today three approaches to emotion data categorization are commonly used: discrete, dimensional, and hybrid models that combine both.
The discrete emotion approach relies on the categorization reflected in natural language. Each emotion is associated with a semantic field: a particular meaning, or set of meanings, that we ascribe to some affective state. The theory of basic emotions is one of the most famous examples of the discrete approach.
The first references to something similar to what is meant by basic, or primary, emotions can be found in early philosophical texts, such as those of the Greek and Chinese traditions. Plato in the Republic considered emotions one of the basic components of the human mind. In Aristotle's functional theory of emotions, reason, emotions and virtue are intertwined, and every healthy human being always (or almost always) emotes and acts in accordance with reason and virtue, whether they realise it or not. Chinese Confucianism recognised from four to seven "qing" (emotions), held to be natural to any human.
In the 20th century the topic received a fresh wave of attention from the scientific community, and a number of authors proposed their own lists, including Paul Ekman, the author of the most widespread account of basic emotions. Ekman supposed that basic emotions should be universal, in the sense that they do not vary across cultures. Different models propose from 6 up to 22 emotions (Ekman, Parrott, Frijda, Plutchik, Tomkins, Matsumoto; for details see Cambria et al., 2012).
The concept of basic emotions is a debatable issue (see, for example, Barrett & Wager, 2006; or Crivelli & Fridlund, 2018). A number of studies have linked basic emotions to the activity of individual brain structures (for instance, Murphy et al., 2003 and Phan et al., 2002), although other works did not find such a correlation (for criticism see Barrett & Wager, 2006). Interestingly, some studies of emotion perception in isolated ethnic groups do not support the hypothesis of the cross-cultural universality of emotions. The Trobrianders of Papua New Guinea are one example (see Crivelli & Fridlund, 2018 and Gendron et al., in press): when looking at a photo of a "fear face", the Trobrianders understood it as a threat display instead of an intended fear expression.
The Atlas of Emotions proposed by Paul Ekman (http://atlasofemotions.org); surprise was removed from the initial 1999 list.
Today many affective computing solutions are based on discrete accounts of emotions and typically cover only the basic emotions, most often the classification proposed by Ekman (as, for example, in Affectiva, the pioneer of Emotion AI). This means automatic systems are trained to recognize only a rather limited number of affective states, even though in life we constantly experience a broad range of emotional states, including complex mixed emotions, and use numerous social signals (like gestures) to supplement interaction.
Another approach, the dimensional one, represents emotions as coordinates in a multidimensional space. Because this space is continuous, emotions of the same nature can differ along various parameters. In affective science these parameters (or dimensions) are most often valence and arousal (as in the RECOLA dataset by Ringeval et al.), and sometimes the intensity of emotions. Thus, sadness can be seen as a less intense version of grief and a stronger one of pensiveness, while at the same time being more similar to disgust than to trust. The number of dimensions varies across the proposed accounts: Plutchik's wheel of emotions has 2 dimensions (similarity and intensity), while Fontaine postulates 4 (valence, potency, arousal, unpredictability). Any emotion in the space possesses a set of characteristics, represented by the magnitude it has along each particular dimension.
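The dimensional idea is easy to sketch in code. Below is a minimal Python illustration: a handful of emotions are placed at assumed valence-arousal coordinates (the numbers are our own illustrative guesses, not values from any published model), and an arbitrary point in the space is mapped to its nearest labelled emotion.

```python
import math

# Illustrative valence-arousal coordinates on [-1, +1] axes.
# These values are assumptions for the sketch, not a published mapping.
EMOTIONS = {
    "grief":       (-0.8, 0.6),
    "sadness":     (-0.6, 0.3),
    "pensiveness": (-0.3, 0.1),
    "calm":        (0.4, -0.5),
    "joy":         (0.8, 0.5),
}

def nearest_emotion(valence: float, arousal: float) -> str:
    """Map a point in the valence-arousal plane to the closest labelled emotion."""
    return min(
        EMOTIONS,
        key=lambda name: math.dist((valence, arousal), EMOTIONS[name]),
    )

print(nearest_emotion(-0.55, 0.25))  # a point close to "sadness"
```

Note how "sadness" sits between "grief" and "pensiveness" on both axes, mirroring the intensity relation described above.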
Hybrid accounts combine the discrete and dimensional approaches. A good example of a hybrid model is the Hourglass of Emotions proposed by Cambria, Livingstone and Hussain (2012). Each affective dimension is characterised by six levels of emotional strength, and these levels are also labelled, yielding a set of 24 emotions. Thus, any emotion can be seen both as a discrete state and as part of a continuum, connected to other emotions by non-linear relations.
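As a sketch of how a hybrid model works, the snippet below discretises a continuous dimension value into one of six labelled strength levels, in the spirit of the Hourglass model. The labels follow its pleasantness dimension; the equal-width bucket boundaries are our own simplification, not the model's actual (non-linear) level curves.

```python
# Six labelled strength levels for one dimension (pleasantness), ordered
# from most negative to most positive. Bucket boundaries are equal-width
# bins over [-1, +1] -- a simplification for illustration.
LEVELS = ["grief", "sadness", "pensiveness", "serenity", "joy", "ecstasy"]

def level_for(value: float) -> str:
    """Map a continuous dimension value in [-1, +1] to a discrete label."""
    value = max(-1.0, min(1.0, value))          # clamp into range
    index = int((value + 1.0) / 2.0 * len(LEVELS))
    return LEVELS[min(index, len(LEVELS) - 1)]  # +1.0 falls in the top bin

print(level_for(-0.9))  # "grief"
print(level_for(0.2))   # "serenity"
```

The same value can thus be read either as a point on a continuum or as a discrete emotion label, which is exactly the hybrid move.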
Emotions in Affective Computing
So why is emotion classification particularly important for affective computing? As we mentioned at the beginning of the article, annotation is deeply affected by the approach we adhere to. To train a neural network to recognise emotions, a dataset is needed. But which emotion we ascribe to a particular expression depends entirely on us, humans, and on the classification model we choose.
Several annotation tools can help researchers today: ANNEMO (Ringeval et al.), used for dimensional models, and ANVIL (Kipp) and ELAN (Max Planck Institute for Psycholinguistics), used for discrete systems. In ANNEMO there are two affective dimensions, arousal and valence, with values ranging from -1 to +1. Thus, any affective state can be assigned values characterizing its intensity and its positivity/negativity. Social dimensions can also be rated, using 7-point scales on five dimensions: agreement, dominance, engagement, performance and rapport.
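A single ANNEMO-style rating could be modelled as a small record like the one below. This is a sketch of the data shape only: the field names are our own, and the real tool records time-continuous traces rather than single values.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    """One hypothetical ANNEMO-style rating at a given moment in a clip."""
    time_s: float    # position in the recording, seconds
    arousal: float   # continuous affective dimension, in [-1, +1]
    valence: float   # continuous affective dimension, in [-1, +1]
    engagement: int  # one of the 7-point social scales, 1..7

    def __post_init__(self) -> None:
        for name in ("arousal", "valence"):
            v = getattr(self, name)
            if not -1.0 <= v <= 1.0:
                raise ValueError(f"{name} must lie in [-1, +1], got {v}")
        if not 1 <= self.engagement <= 7:
            raise ValueError("engagement must be on a 1..7 scale")

# A valid record; out-of-range values raise ValueError at construction.
a = Annotation(time_s=12.4, arousal=0.35, valence=-0.2, engagement=5)
```

Validating the ranges at construction time keeps impossible ratings out of the dataset before any training happens.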
ANVIL and ELAN allow researchers to define their own filters to mark audio-visual emotional content. Filters, or annotations, can be words, sentences, comments, or any other textual content relevant to the description of an affective state. These annotations are static in nature and cannot be assigned magnitudes.
Neither family of systems is strictly preferred over the other; which to use depends on the goals. Dimensional models help avoid the famous problem that some languages have words for emotions that other languages lack, which makes the process of annotation context- and culture-dependent. Still, discrete models are a useful tool for emotion categorization, since it is difficult to objectively assess the magnitudes of dimensions like valence or arousal, and different annotators will provide different estimates.
By the way, emotion classification is a must not only in the field of emotion recognition, but also in emotion synthesis. Consider robotics. The emotional spectrum available to a robot can be integrated into a multidimensional emotion space. The affect system (the system of emotional states a robot can switch between) of MIT's Kismet, probably one of the cutest robots in the AI industry, is based on the dimensional approach. Each dimension of the affect space (arousal, valence, and stance, that is, the robot's readiness to communicate) is mapped to an expression space where each dimension has a characteristic facial posture. As soon as the needed magnitude is reached, the robot will express a different emotion.
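The mapping from affect space to expression space can be sketched as a simple linear blend: each affect axis contributes an offset to a neutral facial posture, scaled by the current magnitude along that axis. Everything below (the posture features and offset values) is a toy illustration of the general idea, not Kismet's actual implementation.

```python
# Hypothetical posture features: (eyebrow raise, ear pitch, lip curl), in [0, 1].
NEUTRAL = (0.5, 0.5, 0.5)

# Offset each axis applies to the posture at magnitude +1 (made-up values).
OFFSETS = {
    "arousal": (0.4, 0.3, 0.0),
    "valence": (0.1, 0.0, 0.5),
    "stance":  (0.0, 0.4, 0.1),
}

def posture(arousal: float, valence: float, stance: float) -> tuple:
    """Blend per-axis offsets into one facial posture, clamped to [0, 1]."""
    affect = {"arousal": arousal, "valence": valence, "stance": stance}
    return tuple(
        min(1.0, max(0.0, base + sum(affect[k] * OFFSETS[k][i] for k in OFFSETS)))
        for i, base in enumerate(NEUTRAL)
    )

print(posture(0.0, 0.0, 0.0))  # neutral affect -> neutral posture
print(posture(0.8, 0.5, 0.0))  # high arousal, mildly positive valence
```

Because the mapping is continuous, the face morphs smoothly as the robot's affective state drifts through the space, rather than snapping between a few canned expressions.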
References
- Barrett, L. F., & Wager, T. D. (2006). The structure of emotion: Evidence from neuroimaging studies. Current Directions in Psychological Science, 15(2), 79–83. doi: 10.1111/j.0963-7214.2006.00411.x
- Cambria, E., Livingstone, A., & Hussain, A. (2012). The Hourglass of Emotions. Cognitive Behavioural Systems, 144–157.
- Chew, A. (2009). Aristotle's Functional Theory of the Emotions. Organon F, 16(1), 5–37.
- Crivelli, C., & Fridlund, A. J. (2018). Facial Displays Are Tools for Social Influence. Trends in Cognitive Sciences, 22(5), 388–399. https://doi.org/10.1016/j.tics.2018.02.006
- Ekman, P. (1999). Basic Emotions. In T. Dalgleish and M. Power (Eds.). Handbook of Cognition and Emotion. Sussex, U.K.: John Wiley & Sons, Ltd.
- Fu Ching-Sheue (2012). What are emotions in Chinese Confucianism? https://www.researchgate.net/publication/267228910_What_are_emotions_in_Chinese_Confucianism?
- Gendron, M., Crivelli, C., & Barrett, L.F. (in press). Universality reconsidered: Diversity in making meaning of facial expressions. Current Directions in Psychological Science.
- Harmon-Jones, E., Harmon-Jones, C., & Summerell, E. (2017). On the Importance of Both Dimensional and Discrete Models of Emotion. Behavioral Sciences, 7(4).
- Murphy, F.C., Nimmo-Smith, I., & Lawrence, A.D. (2003). Functional neuroanatomy of emotion: A meta-analysis. Cognitive, Affective, & Behavioral Neuroscience, 3, 207–233.
- Phan, K.L., Wager, T.D., Taylor, S.F., & Liberzon, I. (2002). Functional neuroanatomy of emotion: A meta-analysis of emotion activation studies in PET and fMRI. Neuroimage, 16, 331–348.
- Plutchik, R. (2001). The Nature of Emotions. American Scientist, 89(4), 344.
- Ringeval, F., Sonderegger, A., Sauer, J., & Lalanne, D. RECOLA & ANNEMO: http://diuf.unifr.ch/diva/recola/annemo.html
- Kipp, M. ANVIL: http://www.anvil-software.org/
- Max Planck Institute for Psycholinguistics. ELAN: https://tla.mpi.nl/tools/tla-tools/elan/
- Emotion, Stanford Encyclopedia of Philosophy: https://plato.stanford.edu/entries/emotion/