Twitter Sentiment Analysis for Stock Market Prediction

Sara Morcillo
Feb 8, 2023

How does your opinion affect the market?

Photo by Alexander Sinn

All publicly known information about a company, including opinions, can considerably affect its current stock price. Because social networks are so present in our daily lives, platforms such as Twitter have become a variable worth considering. In this study, I try to predict whether Tesla's stock price goes up or down based on public opinion about Tesla on Twitter.

In four steps, I built a stock market prediction model using the Twitter API (through Tweepy), NLP (TextBlob and VADER), Granger causality, and a Random Forest model.

1. Data Collection using Tweepy

The first step was to get a Twitter Developer account in order to use the Twitter API through Tweepy, a Python client library. I recommend following the steps in this guide to get one: https://dev.to/twitterdev/a-comprehensive-guide-for-using-the-twitter-api-v2-using-tweepy-in-python-15d9. Good luck!

Thanks to this guide, I got the Developer account, but I could not get a researcher account, so I had to work within certain limitations. In the end, I had these two datasets:

  • Stock Market TESLA (daily open-close): 232 rows (Jan-Dec 2022)
  • Global Mentions of TESLA #: 153,000 rows (Apr-Dec 2022)

Note that the stock market dataset can easily be built without accessing any API; ready-made examples are also available on some platforms.

2. NLP

The next step was to apply sentiment analysis to the 'Global Mentions of TESLA' dataset. For this, I used two existing libraries, TextBlob and VADER, which classify tweets by sentiment. They also measure the intensity of that sentiment: VADER through its well-known compound score, and TextBlob through a comparable polarity score. Both range from -1 to 1, where 0 means neutral, values below 0 are negative, and values above 0 are positive. Very simple, but effective! Here is an example of how the VADER compound score works on a tweet with an opinion about Tesla.

Sentiment analysis using the VADER library
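As a rough text version of that idea, here is a minimal sketch of scoring one tweet with VADER. The helper names (`make_vader`, `label_compound`, `score_tweet`) are my own for illustration and are not taken from the project; the `threshold` parameter is also an assumption, defaulting to the simple "0 is neutral" rule described above.

```python
def make_vader():
    """Build a VADER analyzer (assumes `pip install vaderSentiment`)."""
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer()


def label_compound(compound, threshold=0.0):
    """Map a compound score in [-1, 1] to a sentiment label.

    threshold=0.0 follows the rule in the text (exactly 0 is neutral).
    VADER's README suggests a cut-off of about 0.05 instead, so that
    scores very close to zero also count as neutral.
    """
    if compound > threshold:
        return "positive"
    if compound < -threshold:
        return "negative"
    return "neutral"


def score_tweet(text, analyzer):
    """Return the tweet text together with its compound score and label.

    `analyzer` is any object with VADER's polarity_scores(text) method,
    which returns a dict with 'neg', 'neu', 'pos' and 'compound' keys.
    """
    compound = analyzer.polarity_scores(text)["compound"]
    return {"text": text, "compound": compound, "label": label_compound(compound)}
```

With the library installed, `score_tweet("I love my Tesla!", make_vader())` would return the text with its compound score and a label.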

‘Behavioral economics tells us that emotions can profoundly affect individual behavior and decision-making’. Johan Bollen.

Here is an example visualization of public opinion about Tesla in several countries on certain dates, where -0.785 is a bad opinion and 0.785 is a good one.

TESLA opinion on Twitter (between the 3rd and 11th of December 2022)

To simplify using the API and running the sentiment analysis, here is the function I created. In a few lines, it asks the Twitter API for the latest tweets containing the keyword 'tesla' and computes their compound score. It gives access not only to the text of each tweet but also to its creation date, id, and location, if one is provided.

Image by author
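Since the function above is only shown as an image, here is a hedged sketch of what such a helper could look like with Tweepy's API v2 client. It is not the original code: the function names, the `lang:en -is:retweet` query filters, and the use of the author's profile location are my assumptions, and it presumes your access level includes the recent search endpoint.

```python
def make_client(bearer_token):
    """Create a Twitter API v2 client (assumes `pip install tweepy`)."""
    import tweepy

    return tweepy.Client(bearer_token=bearer_token)


def fetch_tweets(client, keyword="tesla", max_results=10, analyzer=None):
    """Fetch recent tweets for a keyword and return them as plain dicts.

    `analyzer` is optional: any object with VADER's polarity_scores(text)
    method. When it is omitted, 'compound' is left as None.
    """
    response = client.search_recent_tweets(
        # Assumed filters: English only, no retweets.
        query=f"{keyword} lang:en -is:retweet",
        max_results=max_results,  # this endpoint accepts 10 to 100
        tweet_fields=["created_at"],
        expansions=["author_id"],
        user_fields=["location"],
    )
    # Assumption: "location" means the author's profile location, if set.
    users = (response.includes or {}).get("users", [])
    location_by_author = {user.id: user.location for user in users}

    rows = []
    for tweet in response.data or []:  # data is None when nothing matches
        compound = None
        if analyzer is not None:
            compound = analyzer.polarity_scores(tweet.text)["compound"]
        rows.append({
            "id": tweet.id,
            "created_at": tweet.created_at,
            "text": tweet.text,
            "location": location_by_author.get(tweet.author_id),
            "compound": compound,
        })
    return rows
```

Passing the client and the analyzer in as arguments keeps the function easy to try out with stand-in objects before spending any API quota.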

3. Granger Causality

Once I had the sentiment analysis of the tweets about Tesla, I checked for a relationship between Tesla's stock and Twitter mood over the same dates, from April 2022 to November 2022. For this purpose, I used Granger causality.

The Granger causality test is used to determine whether one time series is useful for forecasting another. It indicates predictive usefulness, not true cause and effect. Fortunately, the results were encouraging: I observed a relationship, so I continued with the investigation. Specifically, I detected Granger causality between the stock change (up or down) and the compound value of the tweets at a lag of 4 days.

Image by author
Correlation between Tesla stock change and Tesla mood on Twitter
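For readers who want to try this step, here is a minimal sketch using `grangercausalitytests` from statsmodels. It makes several assumptions that are not confirmed by the project: tweets are averaged into one compound value per day, only days present in both series are kept, and the stock change is encoded as 1 (up) or 0 (down). The helper names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def daily_mean_compound(tweets):
    """Average compound score per day.

    `tweets` is an iterable of (day, compound) pairs, where `day` is an
    ISO date string or a date object (anything that sorts by date).
    """
    by_day = defaultdict(list)
    for day, compound in tweets:
        by_day[day].append(compound)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


def align(direction_by_day, mood_by_day):
    """Keep only days found in both series, as [direction, mood] rows in date order."""
    days = sorted(set(direction_by_day) & set(mood_by_day))
    return [[direction_by_day[day], mood_by_day[day]] for day in days]


def granger_p_values(rows, max_lag=4):
    """Return {lag: p-value} for the SSR-based F-test at lags 1..max_lag.

    statsmodels tests whether the SECOND column (mood) helps forecast
    the FIRST column (stock direction). Assumes numpy and statsmodels
    are installed.
    """
    import numpy as np
    from statsmodels.tsa.stattools import grangercausalitytests

    results = grangercausalitytests(np.asarray(rows, dtype=float), maxlag=max_lag)
    # Each entry is (test_results, fitted_models); 'ssr_ftest' holds
    # (F statistic, p-value, df_denom, df_num).
    return {lag: res[0]["ssr_ftest"][1] for lag, res in results.items()}
```

A small p-value at lag 4 would be consistent with the result described above. The test assumes reasonably stationary series, so it is worth checking that before trusting the numbers.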

4. Random Forest Classifier

The final step was to create a classification model, for which I used a Random Forest classifier. With this model, I could predict whether the stock price would go up or down in the next 4 days. The model predicted that Tesla's stock price would increase from day 4 of this study (11 December 2022), that is, a rise in the share price by that date.
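As an illustration of this step, here is a minimal sketch with scikit-learn's `RandomForestClassifier`. The feature design is my assumption, not necessarily what the project does: each day's up/down label is predicted from the mean compound score of the previous 4 days. The helper names and hyperparameters are hypothetical too.

```python
def make_lagged_dataset(mood, direction, n_lags=4):
    """Build (X, y) from two equally long, date-ordered lists.

    X[i] holds the mood of the n_lags days before a given day, and
    y[i] is that day's direction (1 = up, 0 = down).
    """
    if len(mood) != len(direction):
        raise ValueError("mood and direction must have the same length")
    X, y = [], []
    for i in range(n_lags, len(mood)):
        X.append(list(mood[i - n_lags:i]))
        y.append(direction[i])
    return X, y


def fit_and_score(X, y, train_fraction=0.8, seed=0):
    """Train on the earliest rows and score on the most recent ones.

    The split is chronological (no shuffling), so the model is never
    evaluated on days that come before its training data.
    Assumes scikit-learn is installed.
    """
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score

    split = int(len(X) * train_fraction)
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X[:split], y[:split])
    accuracy = accuracy_score(y[split:], model.predict(X[split:]))
    return model, accuracy
```

To get a forecast for the next day, the fitted model can be given the last 4 mood values, for example `model.predict([mood[-4:]])`.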

The main purpose of this study was to gain insight into behavioral economics through data analysis. There are no commercial interests or affiliations. The main reference was the work of Professor Johan Bollen, namely the paper: Johan Bollen, Huina Mao, and Xiaojun Zeng. 2011. Twitter mood predicts the stock market. Journal of Computational Science 2, 1 (2011).

Check out the project on GitHub or contact me to learn more about it! https://github.com/mgmsara/Final_Project

I hope you enjoyed this read. Follow me if you are interested in the value of my content!
