Measuring User Engagement

At Indiana University Bloomington, 11.02.2018

Introduction

Whether a product is work/task-oriented or meant for entertainment, an engaging experience helps users and the business/product achieve their goals together. Standardizing and measuring user engagement therefore becomes important for determining the gap between users' actual experience and both the users' and the business's goals.

However, the term user engagement is tricky to define. It can be defined neutrally as how users engage with online applications or internet services and how users' responses to those products are assessed, or understood as how attractive/fascinating an online service is to its users [1]. O’Brien describes user engagement as “a quality of user experiences with technology that is characterized by challenge, aesthetic and sensory appeal, feedback, novelty, interactivity, perceived control and time, awareness, motivation, interest and affect” [2], from which we can see that it is a tricky term partly because a large number of aspects of user experience may count as part of it. Previous studies have revealed several characteristics or aspects of user engagement, including but not limited to: focused attention [2][3], positive affect [2], aesthetics [2][4], endurability [2][5], and novelty [2][3]. Thus, various methods and models have been proposed to describe and target different aspects of user engagement.

User Engagement Measures

Previous studies propose three groups of user engagement measurement methods: self-report/qualitative measures, cognitive engagement metrics (physiological measures), and behavior metrics (web-analytics).

1. Self-report measures:

As the name suggests, self-report measures are methods in which participants describe their own perception of the experience of using the online service/application/product, including their feelings, emotions, understanding, and attitudes. Different techniques can be used to implement self-report measures, e.g. interviews, surveys, and think-aloud protocols. [2] describes a study that used semi-structured interviews to measure user engagement. Given the exploratory nature of their study, interviews were the best measuring method since they enabled the researchers “to delve into the thoughts, behaviors, and the feelings of our participants and to allow them to recount a real-life experience.” As this shows, self-report measures have the advantages of investigating user engagement directly from the users' perspective and of high internal consistency. They make it easy for researchers to capture the meaning and specificity of the behavior or experience in construct definition. Thus, self-report measures are well suited to collecting direct feedback on positive/negative affect, interest, and meaningful evaluations of aesthetics. Another advantage is that self-report measures can be carried out in different environments (lab or online settings), which offers high flexibility.

Self-report measures, however, have several drawbacks. The results rely mainly on the subjective reports of the users, which may introduce bias. The final results are constructed from communication about and memory of the past experience, so information processing, interpretation, and quality of memory for the pre-defined questions vary among participants, e.g. people of different ages or with different levels of cognitive ability. Participants' reticence about specific questions may also lead to inaccuracy.

2. Cognitive engagement metrics/physiological measures

The second group consists of physiological measures, such as eye tracking, facial expression analysis, and skin conductance. A physiological measure is usually used to evaluate engagement in a specific task. For instance, when people experience frequent mood swings from anxious to calm while playing a horror game, the variation in skin conductance due to sweat can help capture those mood swings and evaluate the magnitude of emotional arousal.

Unlike self-report measures, physiological measures provide more objective data. They can also be collected while the users are performing the task. Besides, physiological signals can provide detailed information about physical/mental responses of which users themselves may not be aware.

Although physiological measures can provide fine-grained data as well as an instant and direct measure of user attention, the meaning of the acquired data is not explicitly provided by the measures themselves, and at times they flatten the complexity of user engagement considerably. In the skin-conductance example above, the anxious emotion is inferred from the data rather than observed directly. Humans' cognitive and emotional states may contain multiple layers of information, and there are several models of this internal structure. [6] defines emotional states on three continuous dimensions: pleasure-displeasure, degree of arousal, and dominance-submissiveness (Figure 1). As shown in Figure 1, excitement, anger, and distress might have the same level of emotional arousal while differing on the pleasure and dominance dimensions. In this case, skin conductance might be ambiguous about how pleasurable the experience is. Physiological and psychological reactions are not in a one-to-one mapping, and chances are that physiological measures cannot fully articulate the real experience in terms of meaningfulness. The lab setting also makes large-scale measurement impossible. Meanwhile, being unfamiliar with the lab setting, participants might behave differently than they do in daily life, so the resulting feedback can be misleading.

Figure 1. Russell’s three-factor emotional state model [6].
The dominance-submissiveness dimension is not shown in this diagram.
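
To make the ambiguity concrete, the sketch below places a few emotional states in the pleasure-arousal-dominance space of [6]. The coordinates are purely illustrative assumptions, not the empirical values reported by Russell and Mehrabian; the point is only that a sensor tracking arousal alone (such as skin conductance) cannot distinguish states that differ on the other two axes.

```python
from dataclasses import dataclass

@dataclass
class EmotionalState:
    """A point in a pleasure-arousal-dominance (PAD) space, each axis in [-1, 1]."""
    pleasure: float   # displeasure (-1) .. pleasure (+1)
    arousal: float    # calm (-1) .. aroused (+1)
    dominance: float  # submissive (-1) .. dominant (+1)

# Illustrative coordinates only (assumed for this example, not taken from [6]).
states = {
    "excitement": EmotionalState(pleasure=+0.7, arousal=+0.7, dominance=+0.4),
    "anger":      EmotionalState(pleasure=-0.6, arousal=+0.7, dominance=+0.5),
    "distress":   EmotionalState(pleasure=-0.7, arousal=+0.7, dominance=-0.5),
}

# A skin-conductance signal roughly tracks the arousal axis only, so all three
# states look identical to it despite differing in pleasure and dominance.
for name, s in states.items():
    print(f"{name:10s} arousal={s.arousal:+.1f} "
          f"pleasure={s.pleasure:+.1f} dominance={s.dominance:+.1f}")
```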

3. Behavior metrics (web-analytics)

Web-analytics have long been used to study users' engagement with specific sites, and there is a wealth of tools, models, and case studies about behavior metrics. Google Analytics, for instance, is one of the most popular web-analytics tools for studying aspects of engagement such as click depth, duration, recency, and loyalty by tracking page views, time spent on the site, return-user rate, frequency of return, and so on. Lehmann et al. [7] proposed a model of behavior metrics in 2012 that measures user engagement on the scales of popularity, activity, and loyalty (Figure 2). Their study classified users as tourist (1 day per month), interested (2–4 days per month), average (5–8 days per month), active (9–15 days per month), and VIP (16 or more days per month) according to the number of days per month a site is visited, and mapped the different kinds of users onto the three scales. The study also presents a time-based model to show how engagement changes over time.
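
As a concrete illustration, the short sketch below encodes those visit-frequency groups in Python. The thresholds come from the description of [7] given above; the "inactive" bucket and the sample data are my own additions for completeness.

```python
def classify_user(days_visited_per_month: int) -> str:
    """Bucket a user by the number of distinct days per month the site was
    visited, following the groups described by Lehmann et al. [7]."""
    if days_visited_per_month <= 0:
        return "inactive"            # not a group in [7]; added for completeness
    if days_visited_per_month == 1:
        return "tourist"
    if days_visited_per_month <= 4:
        return "interested"
    if days_visited_per_month <= 8:
        return "average"
    if days_visited_per_month <= 15:
        return "active"
    return "VIP"

# Hypothetical monthly visit counts per user id.
visits = {"u1": 1, "u2": 3, "u3": 7, "u4": 12, "u5": 20}
print({uid: classify_user(d) for uid, d in visits.items()})
# {'u1': 'tourist', 'u2': 'interested', 'u3': 'average', 'u4': 'active', 'u5': 'VIP'}
```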

Because it is collected at a large user scale, data from web-analytics (behavior metrics) reveals a clear mapping and strong indication of the relationship between users' behaviors and their engagement: the higher the numbers, the greater the engagement. It can also easily provide multiple viewpoints through the different types of data collected, and thus depict a comprehensive picture of how people navigate, explore, and use the site on different scales (especially the time scale). The large volume of data also reduces the margin of error.

Figure 2. Metrics in the model of user engagement: popularity, activity, and loyalty [7]

Though large-scale data analysis helps to define a robust engagement pattern, behavior metrics have their own limitations. From the number of visits we may know that people are willing to come and use the site, but we do not know exactly what makes the site popular. Also, every time a web analysis is conducted, a benchmark for each group of data has to be defined. The model of popularity, activity, and loyalty [7] gives a good example: whether it is the number of visits or the dwell time, the data are normalized, which means the numbers have no meaning in themselves; the meaning only appears once the benchmark is clearly stated. In addition, each model can only illustrate one particular aspect of engagement (e.g. activity only shows whether users are active or not). To build a thorough picture of overall user engagement, different models (e.g. novelty, aesthetics, etc.) as well as different scales (e.g. gender-based factors, geographical factors, etc.) need to be established as well.
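
The sketch below shows one way such a benchmark-relative number can be produced: a raw metric such as mean dwell time is expressed as a z-score against a benchmark group of comparable sites. The normalization scheme and the sample figures are assumptions for illustration, not the procedure used in [7].

```python
import statistics

def normalized_score(value: float, benchmark: list[float]) -> float:
    """Express a raw metric (e.g. mean dwell time in seconds) as a z-score
    relative to a benchmark group, so the number only says how far the site
    sits above or below that benchmark, not anything absolute."""
    mean = statistics.mean(benchmark)
    stdev = statistics.pstdev(benchmark)
    return (value - mean) / stdev if stdev else 0.0

# Hypothetical mean dwell times (seconds) for a group of comparable sites.
benchmark_dwell = [45.0, 60.0, 52.0, 70.0, 38.0]

print(normalized_score(65.0, benchmark_dwell))  # positive: above the benchmark
print(normalized_score(40.0, benchmark_dwell))  # negative: below the benchmark
```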

Conclusion

User engagement is complex and multifaceted. A single standard metric only illustrates limited aspects of engagement [7] and to some extent simplifies the complexity of the mental states, behavior patterns, and overall experience one may have. Moreover, no single metric is inherently better than the others; the measurement of user engagement always depends on the aim: use self-report measures for narrative, open-ended evaluation or specific descriptions of personal experience, affect, and interest; eye tracking, EEG, or heart rate for evaluating specific tasks and utility; and web-analytics for extracting general or specific behavior patterns. The approach to measuring user engagement, the setting of the environment, and the data collection and analysis should also vary with the context of use, the product's functions, and the stage of the product life cycle. For websites, skimming is the main browsing behavior, while this does not apply to mobile applications or interactive VR games. Thus, even if eye tracking is set up to test tasks in all three kinds of environments, the criteria for interpreting the data need to be carefully defined. Users' behavior patterns and assessments may also change from the very beginning until they become familiar with the product or task: novelty might satisfy their needs at first and wear off afterward. Tracking this dynamic engagement requires constantly adjusting the combination and settings of measures. It is hard to conduct a thorough examination of user engagement in every single detail. Nevertheless, seeing the interaction between users and products as a situated and ever-changing activity, and then selecting metrics according to our design/business goals, will help us better understand the present circumstances and provide constructive guidelines for gradual improvement.

References:

[1] Sutcliffe, A. Designing for User Engagement: Aesthetic and Attractive User Interfaces. Morgan & Claypool Publishers.

[2] O’Brien, H. L., & Toms, E. G. (2008). What is user engagement? A conceptual framework for defining user engagement with technology. Journal of the American society for Information Science and Technology, 59(6), 938–955.

[3] Webster, J., & Ho, H. (1997). Audience engagement in multimedia presentations. ACM SIGMIS Database: the DATABASE for Advances in Information Systems, 28(2), 63–77.

[4] Burel, F., & Baudry, J. (1995). Social, aesthetic and ecological aspects of hedgerows in rural landscapes as a framework for greenways. Landscape and Urban Planning, 33(1–3), 327–340.

[5] Read, J. C., MacFarlane, S. J., & Casey, C. (2002, August). Endurability, engagement and expectations: Measuring children’s fun. In Interaction design and children (Vol. 2, pp. 1–23). Shaker Publishing Eindhoven.

[6] Russell, J. A., & Mehrabian, A. (1977). Evidence for a three-factor theory of emotions. Journal of research in Personality, 11(3), 273–294.

[7] Lehmann, J., Lalmas, M., Yom-Tov, E., & Dupret, G. (2012, July). Models of user engagement. In International Conference on User Modeling, Adaptation, and Personalization (pp. 164–175). Springer, Berlin, Heidelberg.
