Recognizing Mental Health Disorders on Social Media using Machine Learning

Pranav Kapur
Published in Analytics Vidhya · 8 min read · Oct 17, 2020

90% of people who commit suicide are suffering from a mental health disorder, and over 50% are depressed. Yet most mental illnesses are very treatable: over 90% of depression patients who receive treatment respond positively, with reduced symptoms. So why are so many people still committing suicide due to these problems if they are so treatable? Because about half of the people who experience mental health problems don’t receive any treatment at all.

Why don’t people receive treatment if it is so effective?

Of the people who didn’t receive mental health treatment, 72% acknowledged they did need it but still did not pursue it, mainly because of cost, time commitment, and confidentiality concerns. The other 28% mostly thought that they could handle it themselves, or that treatment wouldn’t help. Another major cause of going untreated is misdiagnosis: on average, half of depressed patients who are screened for depression are not diagnosed with it, often due to inherent biases. Lack of treatment for any of these reasons vastly increases the risk of suicide, self-harm, and worsening of the person’s condition.

How can we treat more people?

In essence, we use a machine learning algorithm to detect social media users with mental health disorders and connect them, through our anonymized user interface (UI), with therapists from whom they can receive treatment.

Using Machine Learning to Recognize Mental Health Disorders

All of these causes of untreated mental health problems can be vastly reduced by applying artificial intelligence to social media. A machine learning algorithm can recognize mental health disorders like depression as well as, and sometimes better than, current doctors and psychiatrists. Through supervised learning with natural language processing (NLP), the algorithm learns patterns in how people with and without mental illnesses talk and write; studies have shown that mental health patients often exhibit slight changes in personality and in how they communicate. The algorithm would mainly apply NLP to text, combined with a neural network that uses other data Twitter gives out, such as how often and at what times someone posts.

Twitter

On a social media site like Twitter, people often have anonymous accounts or pseudonyms, and they also post much more often. This allows people to speak out and vent about how they truly feel without worrying about what others will think of them. With more posting and more data, it is also easier for the algorithm to pick up on signs of mental health problems and make more accurate predictions.

Twitter API

Many social media platforms, including Twitter, Reddit, Facebook, and Instagram, have APIs (application programming interfaces) that allow people to access data from their platforms, although each has its own restrictions on what data it gives out.

Our algorithm would first be implemented on Twitter, whose API gives out a lot of useful information: it allows searching all tweets by keyword, as well as access to all posts from a user in the last 7 days. Although the API is not a database of every tweet made on the site, keyword search is enough to find candidates, because machine learning studies have shown that almost all people with mental health disorders use self-related words like “me”, “my”, and “I” very commonly. There are other common phrases as well, but the occurrence of these words far outnumbers any others.

The algorithm would be applied to all tweets containing these keywords, since almost all people with mental disorders would have at least one tweet where the keywords are present. Once there are possible signs of a mental disorder, the API allows you to look at the user’s other tweets, whether they contain the keywords or not.
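The keyword screening step above can be sketched as a small filter. This is a minimal illustration, not the actual model: the word list follows the self-related words the studies mention, and the 0.15 ratio threshold is an invented placeholder, not a value from any study.

```python
import re

# Self-referential words the cited studies found to be overrepresented
# in tweets by people with mental health disorders.
SELF_WORDS = {"i", "me", "my", "mine", "myself"}

def self_reference_ratio(tweet: str) -> float:
    """Fraction of the tweet's words that are self-referential."""
    words = re.findall(r"[a-z']+", tweet.lower())
    if not words:
        return 0.0
    return sum(w in SELF_WORDS for w in words) / len(words)

def needs_closer_look(tweet: str, threshold: float = 0.15) -> bool:
    """True if this tweet should trigger a pull of the user's recent timeline."""
    return self_reference_ratio(tweet) >= threshold

print(needs_closer_look("I feel like my life is falling apart and I can't cope"))
```

A tweet that clears this cheap filter would then justify the more expensive step of fetching the user’s last 7 days of posts and running the full model on them.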

How can we treat people virtually?

When the algorithm does find signs of mental illness, an automated message would be sent to the flagged person, helping them receive treatment. The person would be directed to our UI, which would let them see many specialized therapists and some information about each, including pricing. This concept would remove some of the main reasons people don’t get treatment: time commitment and confidentiality. It would also help with the problem of cost. Furthermore, if support is right there, it might encourage people who think they can handle it themselves to talk to a therapist anyway. The mental health disorders we would be recognizing and treating are depression and anxiety, as well as people close to committing suicide, which is closely related to severe cases of depression.

To get therapists offering support online, we would partner with mental health clinics or even unions of mental health therapists and psychiatrists. This would benefit them as well: if they want, they can support as many people as possible, most likely many more than at their office alone. They could also do this from home, online, and still get paid at the rate they choose, which could be a great way to supplement their income while still doing what they are good at and enjoy. Each therapist’s rate would be shown on the UI, so people who want cheaper support can find therapists and psychiatrists offering lower rates.

User Interface

The UI would also allow people to say if they do not have a mental health problem, or if they have a different problem than what was predicted. If someone does have a mental health problem and doesn’t get recognized by the algorithm, they can still use our UI to find therapists who they want to connect to without having to create an account or share personal information. We would encourage these users to share their social media accounts, although it would not be required. This would allow our algorithm to continuously improve with all the labeled data it is receiving, allowing many more people to get help.

The UI would have a home screen telling the user that they were flagged for a possible mental health disorder. The user could then browse therapists sorted by which mental health disorder they specialize in. Each therapist would list their contact info and rates so that the user can connect with the therapist they want.
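The browsing logic the UI needs is essentially a filter-and-sort over therapist listings. Here is a small sketch of that step; the `Therapist` fields and the sample entries are assumptions for illustration, not a real data model.

```python
from dataclasses import dataclass

@dataclass
class Therapist:
    name: str
    specialty: str    # e.g. "depression" or "anxiety"
    hourly_rate: int  # chosen by the therapist, shown on the UI
    contact: str

def match_therapists(therapists, disorder, max_rate=None):
    """Therapists specializing in the given disorder, cheapest first."""
    matches = [t for t in therapists
               if t.specialty == disorder
               and (max_rate is None or t.hourly_rate <= max_rate)]
    return sorted(matches, key=lambda t: t.hourly_rate)

# Hypothetical listings, just to show the filtering in action.
listings = [
    Therapist("Dr. A", "depression", 120, "a@example.com"),
    Therapist("Dr. B", "depression", 80, "b@example.com"),
    Therapist("Dr. C", "anxiety", 60, "c@example.com"),
]
for t in match_therapists(listings, "depression"):
    print(t.name, t.hourly_rate)
```

Sorting by rate puts the cheaper options first, which matches the goal of helping users who want lower-cost support find it quickly.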

Possible challenges with this concept

Privacy

One big problem with this system is privacy. No one knows exactly what data the algorithm weighs most heavily when deciding whether someone has a mental health problem. Still, many people may have concerns about a machine looking at what they tweet and at their activity data. We cannot stop the machine from using this data, but we can make sure the information is kept completely private and encrypted. If the AI flags someone for a mental health problem, an automated message connecting them to the UI will be sent without any human having access to who has or hasn’t been flagged. The UI will also be anonymized, and people will not have to share any personal info at that stage if they do not want to.

Scale

Another possible issue is the sheer scale of this algorithm. There are about 6,000 tweets a second, roughly 500 million tweets a day, so we would need computers powerful enough to apply the algorithm to even more than 6,000 tweets a second. That is because, if a tweet is suspicious, the algorithm would also look into the user’s other posts from the last 7 days, allowing a much more accurate prediction. However, Facebook already analyzes every post on its platform for suicide warning signs using a machine learning algorithm, and it sees about 55,000 posts per second compared to Twitter’s 6,000, showing that a machine learning system of this scale is feasible and doable.
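A quick back-of-envelope check of these scale figures:

```python
# Rough daily-volume estimates from the per-second rates quoted above.
TWEETS_PER_SECOND = 6_000
FB_POSTS_PER_SECOND = 55_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = TWEETS_PER_SECOND * SECONDS_PER_DAY
print(f"{tweets_per_day:,} tweets per day")  # 518,400,000

# Facebook's throughput is roughly 9x Twitter's, yet it scans every post.
print(round(FB_POSTS_PER_SECOND / TWEETS_PER_SECOND, 1))
```

So 6,000 tweets a second works out to about half a billion tweets a day, which is the volume any deployment would have to keep up with, and Facebook’s system already handles nearly an order of magnitude more.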

Accuracy

One more counterargument concerns the accuracy of the model. Of course, it will not be 100% right, and probably not even 75% right. However, the main goal is to identify as many true positives as possible, so the algorithm will be tuned toward high recall, even if that leads to more false positives. We think it is more important to reach the people who need help than to avoid sending extra messages that others can simply ignore. Additionally, even though the accuracy seen in current studies has been about 60%, the same accuracy is seen in medical professionals. With all the data this system would bring in, plus the rate at which AI is improving, there is plenty of room for that number to grow. Meanwhile, medical professionals will not suddenly become much more accurate or shed their biases.
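Tuning toward recall, as described above, amounts to choosing the flagging threshold so that a target fraction of true positives is caught, accepting extra false positives. A minimal sketch of that selection, with made-up scores and labels:

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when flagging every score >= threshold."""
    tp = sum(s >= threshold and y for s, y in zip(scores, labels))
    fp = sum(s >= threshold and not y for s, y in zip(scores, labels))
    fn = sum(s < threshold and y for s, y in zip(scores, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pick_threshold(scores, labels, min_recall=0.9):
    """Highest threshold that still reaches the recall target
    (higher threshold = fewer false positives)."""
    for t in sorted(set(scores), reverse=True):
        _, recall = precision_recall(scores, labels, t)
        if recall >= min_recall:
            return t
    return 0.0  # no threshold suffices: flag everyone

# Hypothetical model scores on a labeled validation set (1 = has disorder).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 1, 0]
print(pick_threshold(scores, labels, min_recall=0.9))  # 0.4
```

Here the chosen threshold of 0.4 catches all three positives at the cost of one false positive, which is exactly the trade-off the paragraph above argues for.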

The 60% rate achieved by machines is also more impactful than the 60% rate achieved by medical professionals. About 60 million American adults suffer from a mental disorder each year, and most of them are on social media. However, a majority of those people never get diagnosed or treated. By applying the algorithm on social media, we could diagnose over twice as many people, resulting in tens of millions of people globally getting diagnosed and gaining access to treatment.

How the algorithm can work most efficiently

Type of Algorithm

For the machine learning model to work most efficiently, the right algorithm, or combination of algorithms, needs to be used. Studies building similar systems have tried many different strategies, but some general principles are common and effective in this situation. The overall structure is a neural network. For natural language processing of the text, recurrent neural networks and Bayesian networks have both worked very well and could both be used. To incorporate other Twitter data, like the time and number of tweets, a simple feed-forward neural network can be used. The overall network combines these results to return a probability for each mental health disorder: depression, anxiety, and suicide risk. If the probability for any of them exceeds a certain threshold, the automated message is sent to that user.
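The final combination step can be sketched numerically. This is only an illustration of the logic: in the real system the two input scores would come from trained networks (the recurrent net over tweet text and the feed-forward net over activity data), and the weights, bias, and 0.7 threshold here are invented placeholders.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def disorder_probability(text_score, activity_score,
                         w_text=2.0, w_activity=1.0, bias=-1.5):
    """Combine the text-branch and activity-branch scores into one probability.
    Weights are illustrative, not learned."""
    return sigmoid(w_text * text_score + w_activity * activity_score + bias)

def should_message(probs, threshold=0.7):
    """Send the automated message if any disorder's probability clears the threshold."""
    return any(p >= threshold for p in probs.values())

probs = {
    "depression": disorder_probability(0.9, 0.6),
    "anxiety": disorder_probability(0.4, 0.3),
    "suicide_risk": disorder_probability(0.2, 0.1),
}
print(should_message(probs))  # True (depression clears the threshold)
```

In practice the combining weights would themselves be learned by the overall network, and the message threshold would be tuned toward recall, as discussed in the Accuracy section.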

Type of Data

Current machine learning methods can recognize patterns in language quite well, and patterns in structured data like the time and number of tweets even better. Because of this, an algorithm can do a good job recognizing signs of depression, anxiety, or suicide risk in tweets. Many social media sites, including Twitter, use pictures as well. But on Twitter, because most visual media is not actually of the user or taken by the user, images would not add much useful data and could actually hurt the model’s performance.

Long-term Possibilities

In the future, this type of algorithm could be applied to many social media platforms, like Instagram, Facebook, and Reddit, in addition to Twitter. Although Reddit may have the same problem with images as Twitter, Instagram and Facebook are both places where users often share personal photos along with captions. Some algorithms can recognize patterns in pictures quite well, and one study has shown that depression can be recognized through Instagram photos. On Reddit, an algorithm similar to the Twitter one could be used.

If machine learning is applied to all these social media sites, which over half of the world uses, it could have an enormous impact on improving mental health and preventing suicide around the globe. Additionally, with researchers and companies working on AI-based virtual therapists, such a system could eventually replace contacting a real therapist, increasing the ease of getting treatment and reducing the cost. However, that possibility seems much further away than the recognition of mental health disorders.

One more possibility is adding not only therapists and psychiatrists, but also people who have previously endured mental health problems themselves, and who want to help others going through similar things. Many people may want this support so that they have someone who can empathize with them even more.
