Tending Unmarked Graves: Automatic Identification of Post-mortem Content on Social Media

Aaron Jiang
Published in ACM CSCW
Oct 19, 2018

This blog post summarizes a paper about machine learning identification of post-mortem social media content that will be presented at the 21st ACM Conference on Computer-Supported Cooperative Work and Social Computing.

December 24th, 2014. The day before Christmas. Eric Meyer was checking his Facebook, as usual, and saw something in his News Feed. It was Facebook’s Year in Review — a feature that algorithmically generates a video with highlights from the previous year. People use this feature to share video compilations of vacations, parties, and birthdays, all accompanied by joyful and upbeat music.

The video awaiting Mr. Meyer, however, prominently featured the death of his six-year-old daughter.

Eric Meyer’s encounter with post-mortem content was nothing short of “cruel.” Image source: Eric Meyer.

It wasn’t just Eric Meyer who had an upsetting experience with post-mortem content. There have been numerous stories of social media platforms reminding people to get in touch with their dead friends, or to wish them a happy birthday. These problems call for ways to handle death sensitively and compassionately on social media platforms. Platforms may want to provide people with support during difficult times, but in order to do so, platforms need to have ways to detect mortality.

Determining from social media whether someone has died, however, is more difficult than it might seem. Identifying post-mortem accounts remains a manual process: most platforms rely on survivors to report the deaths of users. Manual reporting is labor intensive and emotionally taxing, and depending on survivors to report deaths also results in delayed, inconsistent, and ultimately unreliable data.

To tackle this problem, in my work with Dr. Jed Brubaker, we developed a machine learning-based method to automatically identify both post-mortem social media profiles and memorial comments. We trained multiple machine learning classifiers on 870,326 comments from 2,688 public profiles on MySpace, using bag-of-words features as well as style, topic, and sentiment measures, and compared their performance.
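
To make the idea of bag-of-words classification concrete, here is a minimal sketch of a comment classifier built only from word counts. It is not the pipeline from the paper: the Naive Bayes model, the tokenizer, and the six toy comments are all assumptions made up for illustration, and the style, topic, and sentiment features are left out entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(labeled_comments):
    """Count words per class from a list of (text, label) pairs."""
    word_counts = {}          # label -> Counter of word frequencies
    class_counts = Counter()  # label -> number of comments with that label
    for text, label in labeled_comments:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy comments (hypothetical), not drawn from the MySpace dataset.
TOY_COMMENTS = [
    ("rest in peace, we miss you so much", "memorial"),
    ("i miss you every day, rest easy", "memorial"),
    ("can't believe you are gone, love you forever", "memorial"),
    ("see you at the party tonight", "other"),
    ("thanks for the add, cool page", "other"),
    ("lol that movie was great, call me later", "other"),
]

model = train_naive_bayes(TOY_COMMENTS)
print(predict(model, "we miss you, rest in peace"))
print(predict(model, "call me about the party"))
```

A classifier like this sees only the words in a single comment, which is what would let it label a comment without knowing which profile it was posted to.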

Our best-performing machine learning classifiers achieved an F1 score of 0.882 on identifying post-mortem profiles, and 0.865 on identifying memorial comments. In other words, not only are we able to detect profiles of deceased users with high accuracy, we can do the same on individual social media comments without the context of the profiles they are posted to, which is often how these comments appear in social media feeds.
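
For readers unfamiliar with the F1 score, it is the harmonic mean of precision (how many flagged items are truly post-mortem) and recall (how many truly post-mortem items get flagged). The sketch below computes it from raw counts; the counts in the example are hypothetical and are not taken from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    flagged = true_positives + false_positives   # everything the classifier flagged
    actual = true_positives + false_negatives    # everything that should be flagged
    if flagged == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / flagged
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct flags, 10 false alarms, 20 misses.
example_f1 = f1_score(true_positives=90, false_positives=10, false_negatives=20)
print(round(example_f1, 3))
```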

Our classifiers can also detect mortality quickly with minimal linguistic signals. We can determine 19.9% of the post-mortem profiles with just the first post-mortem comment, 52.3% after the first four, and 90.1% after the first nineteen. In our dataset, this result means that we were able to classify 31.8% of the post-mortem profiles on the day of death, 52.3% within the first day after death, and 90.1% within ten days after death.
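
One way to picture this kind of measurement is to show a profile classifier only the first k comments of each profile and count how many profiles it flags. The sketch below does that with a stand-in keyword classifier and four invented profiles; both are assumptions for illustration and neither reflects the classifiers or data from the paper.

```python
def fraction_detected(profiles, classify_profile, k):
    """Share of profiles flagged when only the first k comments are visible."""
    if not profiles:
        return 0.0
    flagged = sum(1 for comments in profiles if classify_profile(comments[:k]))
    return flagged / len(profiles)


# Hypothetical keyword list used by the stand-in classifier below.
MEMORIAL_WORDS = {"rip", "miss", "heaven", "rest"}


def keyword_classifier(comments):
    """Stand-in classifier: flag a profile if any comment has a memorial keyword."""
    return any(MEMORIAL_WORDS & set(c.lower().split()) for c in comments)


# Invented profiles; each is a chronological list of comments after a death.
TOY_PROFILES = [
    ["rip my friend", "miss you"],
    ["can't believe it", "why", "rest easy"],
    ["so sad", "miss you so much"],
    ["thinking of your family", "love you", "always", "see you in heaven"],
]

for k in (1, 2, 3, 4):
    print(k, fraction_detected(TOY_PROFILES, keyword_classifier, k))
```

In this toy data the detected share rises with each additional comment, which is the same shape of curve the percentages above describe.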

Our machine learning system marks a first step toward identifying mortality at scale in social media contexts, and points to promising ways such systems could make social media platforms more compassionate and caring. However, we caution against simplistic or out-of-the-box use of mortality and memorial content classifiers such as ours. While our classifiers performed well, it is important to look beyond overall performance and focus on the specific kinds of errors a classifier makes. For example, on social media platforms, precision (being right when a user is identified as deceased) should be prioritized over recall (covering as many possibly deceased users as possible), because false positives, which identify living users as deceased, are much more distressing than false negatives. We recommend that researchers and practitioners in all areas consider mortality in their work, which includes thinking carefully about the contexts in which automatic classifiers are used and the ways these classifications will shape the interactions they design.
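
As one possible way to act on this recommendation, a platform could choose its decision threshold by fixing a minimum acceptable precision first and only then maximizing coverage. The sketch below assumes a classifier that outputs a score between 0 and 1 for each profile; the scores, labels, and function names are hypothetical and are not part of the paper.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when profiles scoring >= threshold are flagged."""
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 1.0  # nothing flagged: no false alarms
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def lowest_threshold_for_precision(scores, labels, target_precision):
    """Lowest threshold meeting the precision target, which maximizes recall.

    Recall can only fall as the threshold rises, so the first qualifying
    threshold in ascending order keeps the most true positives.
    """
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall(scores, labels, threshold)
        if precision >= target_precision:
            return threshold, precision, recall
    return None  # no threshold reaches the target


# Hypothetical classifier scores; True means the user is actually deceased.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [True, True, True, False, True, False, False, False]

print(lowest_threshold_for_precision(scores, labels, 1.0))
print(lowest_threshold_for_precision(scores, labels, 0.8))
```

In this toy example, demanding perfect precision raises the threshold and misses one deceased user, which is the kind of trade-off a platform would need to weigh deliberately.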

Citation:

Jialun “Aaron” Jiang and Jed R. Brubaker. 2018. Tending Unmarked Graves: Classification of Post-mortem Content on Social Media. Proc. ACM Hum.-Comput. Interact. 2, CSCW, Article 81 (November 2018), 19 pages. https://doi.org/10.1145/3274350

If you have questions or comments about this study, email Aaron Jiang at aaron [dot] jiang [at] colorado [dot] edu.
