I Know Your Intent: Graph-enhanced Intent-aware User Device Interaction Prediction via Contrastive Learning

Jingyu Xiao
ACM UbiComp/ISWC 2023
Aug 15, 2023

--

Co-authors: Qingsong Zou, Qing Li, Dan Zhao, Kang Li, Zixuan Weng, Ruoyu Li, Yong Jiang

This post summarizes our paper “I Know Your Intent: Graph-enhanced Intent-aware User Device Interaction Prediction via Contrastive Learning”, published in the September 2023 edition of the Proceedings of the ACM on Interactive, Mobile, Wearable, and Ubiquitous Technologies (IMWUT). We will present the paper at UbiComp / ISWC 2023.

User Device Interaction in Smart Homes

With fast-evolving IoT solutions, the number of smart devices in homes has soared and is expected to reach 5 billion by 2025 [1]. The emergence of cloud platforms also allows IoT sensors and actuators to better assist users in various home living activities. User Device Interaction (UDI), i.e., constant device controls, can reflect users’ behavioral habits and intents. UDI prediction for smart homes brings about opportunities from multiple perspectives.

  • For service providers, such as vendors, predicting users’ living behaviors through their device usage histories can offer insights for improving user experience.
  • From the perspective of device intelligence, predicting users’ behaviors can help intelligent platforms recommend actions that users may want to perform, such as “turn off the bed light”, as shown in Fig. 1.
  • From the perspective of user behavior analysis, accurate user behavior prediction can be used for abnormal user behavior identification, elderly/disabled care, or further user behavior analysis.

Challenges When Making Accurate User Device Interaction Prediction

We need to appropriately model the following three factors in user behaviors:

  • Routines. A routine encodes a person’s behavioral habits. Mining the correlations among behaviors in routines benefits UDI prediction. However, noisy behaviors interleaved with routine behaviors cause the model to learn false correlations between noisy behaviors and routine behaviors that co-occur in the same sequence. For example, in the following figure, “unlock the smart lock” and “open the network audio” are noisy behaviors: they are only weakly correlated with the preceding behaviors.
  • Intents. Intent inherently determines user behaviors, and a user behavior sequence may contain multiple intents. The following figure shows an example of a sequential behavior with two intents: laundry (water valve / washing machine / dryer) and cooking (oven / microwave / dishwasher).
  • Multi-level Periodicity. The presence of multiple periodicities makes user behavior dynamic. As shown in the following figure, the user may work overtime and go to bed later than usual every weekend, which shows week-level periodicity. As a mountaineering enthusiast, the user goes climbing at the end of each month and goes to bed early that night due to fatigue, which shows month-level periodicity.

Existing User Behavior Prediction Models

  • Traditional Model: HMM [2], FPMC [3]
  • CNN-based Model: Caser [4]
  • RNN-based Model: LSTM [5], CARNN [6], DeepMove [7], SIAR [8]
  • GNN-based Model: SR-GNN [9]
  • Transformer-based Model: SASRec [10], SmartSense [11], DeepUDI [12]

None of these models can capture routines, intents, and multi-level periodicity in user behavior sequences at the same time.

Our Solution: SmartUDI

To consider user routines, intents and multi-level periodicity, we propose SmartUDI, depicted in the following figure. SmartUDI mainly consists of a Message-Passing-based Routine Extraction algorithm, an Intent-aware Capsule Graph Attention Network and a Cluster-based Historical Attention Mechanism.

Key components and ideas:

  • Message-Passing-based Routine Extraction. Noisy behaviors in routines can cause the model to learn false correlations between behaviors. Therefore, we first extract routines from UDI sequences via message passing, and then apply a contrastive loss on behavior embeddings to minimize the difference between behaviors within the same routine (positive samples) and maximize the difference between behaviors derived from different routines (negative samples).
  • Intent-aware Capsule Graph Attention Network. To encode multiple intents while considering the complex transitions between devices, the behavior sequence is fed into a behavior encoder consisting of a Relational Gated Graph Attention Network (RGGAT) and Time2Vec [13] to learn behavior representations; these representations are then passed to the intent-aware encoder, which extracts multi-intent user representations to form the sequence representation.
  • Cluster-based Historical Attention Mechanism. We capture multi-level periodicity from historical sequences. Specifically, k-means clusters behavior sequences to find the semantically nearest historical sequences. An attention mechanism then aggregates the representations of the current sequence and the semantically nearest historical sequences to produce the prediction vector for the final prediction.
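To make the MPRE bullet concrete, the routine-based contrastive objective can be sketched as below. This is a simplified InfoNCE-style stand-in, not the exact SmartUDI loss: the function name, temperature value, and batch layout are our own assumptions, and positives/negatives are taken directly from routine membership within a batch.

```python
import numpy as np

def routine_contrastive_loss(emb, routine_ids, temperature=0.5):
    """Illustrative InfoNCE-style contrastive loss over behavior embeddings.

    Behaviors sharing a routine id are positives; all other behaviors
    in the batch act as negatives. emb: (n, d); routine_ids: (n,).
    """
    # L2-normalize so dot products become cosine similarities
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = z @ z.T / temperature                  # (n, n) similarity logits
    n = len(routine_ids)
    self_mask = np.eye(n, dtype=bool)
    pos = (routine_ids[:, None] == routine_ids[None, :]) & ~self_mask

    losses = []
    for i in range(n):
        if not pos[i].any():
            continue                             # behavior i has no positive pair
        # denominator sums over everything except self-similarity
        log_denom = np.log(np.exp(sim[i][~self_mask[i]]).sum())
        # average -log p(positive | i) over i's positives
        losses.append((log_denom - sim[i][pos[i]]).mean())
    return float(np.mean(losses))
```

Pulling same-routine behaviors together and pushing cross-routine behaviors apart is exactly what makes the loss drop when routine labels match the embedding geometry.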
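The Time2Vec [13] component in the behavior encoder has a simple closed form: one linear term plus sine terms with learnable frequencies, which lets the encoder represent both absolute time and periodic patterns. A minimal sketch (parameter names are ours; in the model, omega and phi are learned):

```python
import numpy as np

def time2vec(tau, omega, phi):
    """Time2Vec: index 0 is the linear component omega[0]*tau + phi[0];
    indices 1..k-1 are periodic components sin(omega[i]*tau + phi[i])."""
    v = omega * tau + phi
    v[1:] = np.sin(v[1:])
    return v
```

With omega[i] = 2*pi / period, the corresponding sine component repeats exactly every `period` time units, which is how week- or month-level periodicity can be exposed to the downstream encoder.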
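The cluster-based historical attention step can be sketched roughly as follows. This is an illustrative, dependency-free version with a hand-rolled k-means and dot-product attention; the actual SmartUDI mechanism operates on learned sequence representations, and the fusion rule here (a simple average of current and attended history) is our own simplification.

```python
import numpy as np

def historical_attention(current, history, k=2, seed=0):
    """Aggregate semantically nearest historical sequences via attention.

    history: (n, d) representations of past sequences; current: (d,).
    K-means groups the history; sequences in the cluster whose center
    is nearest to `current` are combined with softmax attention weights.
    """
    rng = np.random.default_rng(seed)
    centers = history[rng.choice(len(history), k, replace=False)]
    for _ in range(10):                          # Lloyd iterations
        assign = np.argmin(
            ((history[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (assign == c).any():
                centers[c] = history[assign == c].mean(0)
    # pick the cluster whose center lies closest to the current sequence
    near = np.argmin(((current - centers) ** 2).sum(-1))
    neigh = history[assign == near]
    if len(neigh) == 0:                          # degenerate cluster fallback
        neigh = history
    attn = np.exp(neigh @ current)               # dot-product attention scores
    attn /= attn.sum()
    # fuse the current sequence with the attended historical summary
    return 0.5 * (current + attn @ neigh)
```

Restricting attention to the nearest cluster keeps irrelevant history (e.g., sequences driven by a different periodic pattern) out of the prediction vector.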

SmartUDI Shows High Accuracy and Interpretability

We evaluate model performance using four real-world smart home datasets: three (FR/SP/US) from public datasets (https://github.com/snudatalab/SmartSense) and one anonymized dataset (AN) collected on our testbed. The dataset descriptions are shown in the following table.

All datasets are split into training, validation, and testing sets with a ratio of 7:1:2 according to the timestamps of the behaviors. We create sequential instances with a window of length 10. The first nine behaviors of each window are the input to SmartUDI for predicting the next behavior, i.e., the 10th behavior.
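The instance construction above can be sketched in a few lines. This is a hypothetical helper, not the paper's preprocessing code; the function name and the list-of-tuples output format are our own choices.

```python
def make_instances(behaviors, window=10):
    """Slide a length-`window` window over a behavior log.

    For each window, the first window-1 behaviors form the model input
    and the last behavior is the prediction target.
    """
    instances = []
    for i in range(len(behaviors) - window + 1):
        seq = behaviors[i:i + window]
        instances.append((seq[:-1], seq[-1]))   # (input sequence, target)
    return instances
```

A log of 12 behaviors thus yields 3 overlapping instances, each predicting the behavior immediately following its nine-behavior input.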

SmartUDI demonstrates excellent UDI prediction performance. The following table shows that: 1) The proposed SmartUDI scheme outperforms all baselines; 2) the traditional models, HMM and FPMC, show the worst performance; 3) the CNN-based and RNN-based models outperform the traditional models; 4) SR-GNN outperforms RNN-based models; 5) Transformer-based models SASRec, SmartSense and DeepUDI achieve better performance than all other baselines. Nevertheless, their performance is still inferior to that of SmartUDI.

SmartUDI exhibits high interpretability. The following figure shows that the user’s intent for “sleep/get up” carries the highest weight, so SmartUDI predicts the highest probability for the user behavior “turning off the lights.”

Discussion

There is still room for further improvement.

  • First, SmartUDI does not consider the impacts of environmental factors on human behaviors in smart homes, such as temperature, humidity, and brightness. Environmental factors play an important role in UDI prediction. If the room temperature is high, the user is likely to turn down the air conditioner. If it is dark indoors (e.g., on a cloudy day), the user may turn on the lights even in the morning.
  • Second, SmartUDI considers only the behavior of the current user when making UDI predictions. There may be other users in the smart home, and the behaviors of different users may affect each other.

Considering environmental factors and modeling different users’ behaviors could further improve the performance of SmartUDI.

Conclusion

We propose SmartUDI, a novel user device interaction prediction framework, which achieves accurate UDI prediction by considering routines, intents, and multi-level periodicity of user behaviors. First, to extract routine behaviors from noisy behavior sequences, we propose the Message-Passing-based Routine Extraction (MPRE) algorithm and apply a contrastive loss to minimize the difference between behaviors within the same routine and maximize the difference between behaviors derived from different routines. Second, we propose an Intent-aware Capsule Graph Attention Network (ICGAT), which consists of a relational gated graph attention network and a capsule network, to learn multi-intent user representations. Third, we propose a Cluster-based Historical Attention Mechanism (CHAM) to efficiently learn the multi-level periodicity of user behaviors from the semantically nearest historical sequences.

References

[1] Knud Lasse Lueth. 2018. State of the IoT 2018: Number of IoT devices now at 7B - Market accelerating. https://iot-analytics.com/state-of-the-iot-update-q1-q2-2018-number-of-iot-devices-now-7b/.
[2] Sean R Eddy. 1996. Hidden markov models. Current opinion in structural biology 6, 3 (1996), 361–365.
[3] Steffen Rendle, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2010. Factorizing personalized markov chains for next-basket recommendation. In Proceedings of the 19th International Conference on World Wide Web (WWW). ACM, Raleigh, NC, USA, 811–820.
[4] Jiaxi Tang and Ke Wang. 2018. Personalized top-n sequential recommendation via convolutional sequence embedding. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM). ACM, Marina Del Rey, CA, USA, 565–573.
[5] Qingsong Zou, Qing Li, Ruoyu Li, Yucheng Huang, Gareth Tyson, Jingyu Xiao, and Yong Jiang. 2023. IoTBeholder: A Privacy Snooping Attack on User Habitual Behaviors from Smart Home Wi-Fi Traffic. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) 7, 1 (2023), 1–26.
[6] Qiang Liu, Shu Wu, Diyi Wang, Zhaokang Li, and Liang Wang. 2016. Context-aware sequential recommendation. In 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, Barcelona, Spain, 1053–1058.
[7] Jie Feng, Yong Li, Chao Zhang, Funing Sun, Fanchao Meng, Ang Guo, and Depeng Jin. 2018. Deepmove: Predicting human mobility with attentional recurrent networks. In Proceedings of the 27th International Conference on World Wide Web (WWW). ACM, Lyon, 1459–1468.
[8] Lakshmanan Rakkappan and Vaibhav Rajan. 2019. Context-aware sequential recommendations with stacked recurrent neural networks. In Proceedings of the 28th International Conference on World Wide Web (WWW). ACM, San Francisco, 3172–3178.
[9] Shu Wu, Yuyuan Tang, Yanqiao Zhu, Liang Wang, Xing Xie, and Tieniu Tan. 2019. Session-based recommendation with graph neural networks. In Proceedings of the AAAI conference on Artificial Intelligence, Vol. 33. AAAI, Hilton Hawaiian Village, Honolulu, Hawaii, USA, 346–353.
[10] Wang-Cheng Kang and Julian McAuley. 2018. Self-attentive sequential recommendation. In 2018 IEEE 18th International Conference on Data Mining (ICDM). IEEE, Singapore, 197–206.
[11] Hyunsik Jeon, Jongjin Kim, Hoyoung Yoon, Jaeri Lee, and U Kang. 2022. Accurate Action Recommendation for Smart Home via Two-Level Encoders and Commonsense Knowledge. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM). ACM, Atlanta, Georgia, USA, 1–10.
[12] Jingyu Xiao, Qingsong Zou, Qing Li, Dan Zhao, Kang Li, Wenxin Tang, Runjie Zhou, and Yong Jiang. 2023. User Device Interaction Prediction via Relational Gated Graph Attention Network and Intent-aware Encoder. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (AAMAS). 1634–1642.
[13] Seyed Mehran Kazemi, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, and Marcus Brubaker. 2019. Time2vec: Learning a vector representation of time. arXiv preprint arXiv:1907.05321 (2019).

To gain further insights, we invite you to explore our comprehensive paper titled “I Know Your Intent: Graph-enhanced Intent-aware User Device Interaction Prediction via Contrastive Learning.” Additionally, we look forward to engaging in discussions with you at the upcoming UbiComp / ISWC 2023 conference.
