The Missing Piece for Deployable Behavior Modeling

Published in ACM UbiComp/ISWC 2023, Jul 14, 2023

Unveiling the Potential and Challenges in Cross-Dataset Generalization

Co-authors: Xin Liu, Han Zhang, Weichen Wang, Subigya Nepal, Yasaman Sefidgar, Woosuk Seo, Kevin S. Kuehn, Jeremy F. Huckins, Margaret E. Morris, Paula S. Nurius, Eve A. Riskin, Shwetak Patel, Tim Althoff, Andrew Campbell, Anind K. Dey, and Jennifer Mankoff

In the rapidly evolving field of technology, the ability to accurately model and predict human behavior is of paramount importance. The study “GLOBEM: Cross-Dataset Generalization of Longitudinal Human Behavior Modeling” [link] by Xuhai Xu and his team represents a significant stride in this direction. This research focuses on the application of longitudinal human behavior modeling, shedding light on the potential of sensing technology and the long-standing challenge of cross-dataset generalization. We are making efforts toward deployable behavior modeling algorithms, but there is still a long way to go.

The Potential of Sensing Technology in Behavior Modeling

The ubiquity of digital devices in our daily lives provides an unprecedented opportunity to monitor and understand human behavior. These devices, acting as extensions of ourselves, capture a wealth of data that can be harnessed to predict and understand various aspects of human behavior. In this study, the researchers leveraged this data, combining the efforts of two research groups across two institutes, each with two years of data. They established four datasets with a set of consistent features, re-implemented nine prior behavior detection methods, and built eight recent domain generalization algorithms. Furthermore, they proposed two new methods to enhance generalizability.

Notably, part of the datasets, released as GLOBEM, is open-sourced at https://the-globem.github.io/

GLOBEM Dataset and Platform

Insights into Behavior Modeling: The Case of Depression Detection

One of the key applications of behavior modeling is in the field of health care, particularly in mental health. The study revealed that individuals with depression exhibited certain behavioral patterns. They tended to use their phones more frequently, indicative of difficulties in concentration, a common symptom of depression. They also spent more time at home, were less physically active, and had a more consistent mobility routine. These behaviors align with the diagnostic criteria of depression, which include a reduction in physical movement.

Interestingly, the study also found that individuals with a high depression scale score visited fewer uncommon places and showed a stronger repetitive pattern in their locomotion trajectories. This lack of novelty seeking could be a sign of diminished interest in other activities, another common symptom of depression. Fig 1 highlights these findings.

Fig 1. Each data type’s top features with consistent coefficients of linear mixed effect models between the feature value and depression labels across all datasets. Red indicates negative coefficients and blue indicates positive coefficients.
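To make the idea of "consistent coefficients across all datasets" concrete, here is a minimal sketch of the filtering step only. It assumes the per-dataset coefficients have already been estimated (for example, by a linear mixed effect model); the function name, feature names, dataset names, and numbers are hypothetical and are not taken from the study.

```python
from typing import Dict


def consistent_features(coefs: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Keep features whose coefficient has the same sign in every dataset.

    `coefs` maps dataset name -> {feature name -> fitted coefficient}.
    """
    datasets = list(coefs.values())
    # Only features present in every dataset can be compared.
    shared = set(datasets[0]).intersection(*datasets[1:])
    result = {}
    for feat in sorted(shared):
        values = [d[feat] for d in datasets]
        if all(v > 0 for v in values):
            result[feat] = "positive"
        elif all(v < 0 for v in values):
            result[feat] = "negative"
        # Mixed or zero signs: the feature is dropped as inconsistent.
    return result


# Made-up coefficients for illustration only (not results from the paper).
example = {
    "dataset_1": {"time_at_home": 0.21, "step_count": -0.15, "screen_unlocks": 0.08},
    "dataset_2": {"time_at_home": 0.12, "step_count": -0.09, "screen_unlocks": -0.02},
}
print(consistent_features(example))
```

In this toy input, `screen_unlocks` flips sign between the two datasets, so it is filtered out, while the other two features are kept.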

The Challenge of Cross-Dataset Generalization

While these findings are promising, the study also underscored the significant challenge of cross-dataset generalization. The performance of prior depression detection models varied significantly across different datasets. This variation indicates that a feature that can effectively detect certain behaviors in one dataset may become less informative in another.

Fig 2. The Performance Gap between Model Results on Our New Datasets vs. Reported Results in Prior Work. The dashed line indicates the naive majority baseline. All methods experience a performance drop on the new datasets.

This challenge has been a long-standing issue in the community, highlighting the need for models that can generalize across different datasets and individual behavior differences.
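One common way to measure this kind of generalization is to train on all datasets but one and test on the held-out dataset. The sketch below assumes that protocol (the study's exact evaluation setup may differ) and uses made-up labels and hypothetical dataset names. It also shows why a naive majority predictor is a useful reference line: under balanced accuracy it scores 0.5 whenever both classes appear in the test set.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def leave_one_dataset_out(names):
    """Yield (training dataset names, held-out dataset name) pairs."""
    for held_out in names:
        yield [n for n in names if n != held_out], held_out


def majority_label(labels):
    """Most frequent label in the training data."""
    return Counter(labels).most_common(1)[0][0]


# Made-up binary labels (1 = depressed) for four hypothetical datasets.
labels = {
    "DS1": [0, 0, 0, 1],
    "DS2": [0, 1, 1, 1],
    "DS3": [0, 0, 1, 1, 0],
    "DS4": [1, 0, 0, 0],
}

scores = {}
for train, test in leave_one_dataset_out(list(labels)):
    train_labels = [y for name in train for y in labels[name]]
    guess = majority_label(train_labels)
    preds = [guess] * len(labels[test])
    scores[test] = balanced_accuracy(labels[test], preds)

print(scores)
```

Any real model would replace `majority_label` here; a useful model has to clear the 0.5 line on the held-out dataset, not only on the data it was trained on.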

The Promise of a New Method: Reorder

To address this challenge, the researchers proposed a new method, Reorder. The novelty of Reorder lies in its ability to leverage the continuity of behavior trajectory, which is inspired by the behavior science finding that people’s behavior tends to be continuous. The method involves a new multi-task learning model with a new pretext task called the reordering puzzle. In this task, the temporal order of the feature matrix is shuffled, and the model is trained to reconstruct the original sequence. This process is jointly optimized with the main task of behavior detection. Fig 3 shows the model architecture.

Fig 3. The Design of Reorder Compared to Empirical Risk Minimization (ERM, the basic deep learning model design).
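The reordering puzzle can be illustrated with a small data-preparation sketch. It only shows how a shuffled input and its reconstruction target might be built, and how two losses could be combined. The segment-level shuffling, the function names, and the weighted-sum loss are assumptions for illustration, not the authors' implementation, and no neural network is included.

```python
import random


def make_reorder_example(feature_matrix, num_segments, rng):
    """Split a (time x features) matrix into equal segments and shuffle them.

    Returns the shuffled matrix and `perm`, where perm[k] is the original
    index of the segment now at position k (the pretext-task target).
    """
    n = len(feature_matrix)
    if n % num_segments != 0:
        raise ValueError("number of time steps must divide evenly into segments")
    seg_len = n // num_segments
    segments = [feature_matrix[i * seg_len:(i + 1) * seg_len]
                for i in range(num_segments)]
    perm = list(range(num_segments))
    rng.shuffle(perm)
    shuffled = [row for k in perm for row in segments[k]]
    return shuffled, perm


def restore_order(shuffled, perm):
    """Undo the shuffle given the permutation a model would have to predict."""
    seg_len = len(shuffled) // len(perm)
    restored = [None] * len(perm)
    for pos, orig in enumerate(perm):
        restored[orig] = shuffled[pos * seg_len:(pos + 1) * seg_len]
    return [row for seg in restored for row in seg]


def joint_loss(main_loss, reorder_loss, weight=0.5):
    """Assumed multi-task objective: main task plus a weighted pretext task."""
    return main_loss + weight * reorder_loss


# Toy feature matrix: 8 days, 2 made-up features per day.
days = [[d, d * 10] for d in range(8)]
shuffled, perm = make_reorder_example(days, num_segments=4, rng=random.Random(0))
assert restore_order(shuffled, perm) == days
```

A model that predicts `perm` correctly has, in effect, learned which segments of a behavior trajectory plausibly follow one another, which is the continuity signal the method relies on.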

The Reorder method showed promising results, outperforming other models by at least 3.4% on ROC AUC (a 6.3% relative advantage) and 3.2% on balanced accuracy in absolute terms (a 6.2% relative advantage), both with statistical significance. This improvement suggests that learning the temporal continuity of behavior trajectories can enhance a model's generalizability.

Conclusion

The study by Xuhai Xu and his team represents a significant advancement in the field of behavior modeling. It not only demonstrates the potential of sensing technology in understanding and predicting human behavior but also brings to light the critical challenge of cross-dataset generalization. As we move forward, it is crucial to address this challenge to fully harness the power of technology in transforming various fields, including but not limited to healthcare.