AI vs. Momentum: Predicting Stock Prices Using Social Media Sentiment

accentedge
Nov 19, 2020 · 9 min read

As the corporate landscape evolves and social media continues to impact the business world, at accentedge we decided to do a study using Artificial Intelligence (AI) to see if we could understand the relationship between a company’s stock price and the public’s perception of a company as defined by the social media “sentiment” of that company on Twitter.

To look for a relationship between stock prices and sentiment, we used AI technology to analyze the rate of change in a company’s stock price alongside the sentiment of Tweets mentioning the company. By correlating the company’s stock price with its sentiment on Twitter, we were able to see patterns indicating that social media sentiment can be used as a tool to evaluate the public’s perception of a brand. You can watch the full video at this link: “Predicting Stock Prices Using Social Media Sentiment”.

Existing Research

To determine if social media sentiment can be related to stock price trends we started by looking at earlier studies that indicate there is a correlation between Twitter sentiments and stock price trends.

The research shows that social media trends are perhaps the most important repository of public sentiment. A strong correlation exists between the rise or fall of a company’s stock price and the public opinion or emotions about that company expressed in Tweets.

There is also evidence of causation between public sentiment and stock market movements, in terms of the relationship between mood (based on the average daily mood on Twitter) and the closing price. In addition, we see that the sentiment polarity of peaks in Twitter activity implies the direction of cumulative abnormal returns.

Our Model

To study the stock price return we used the price rate of change: a company’s closing price on a given day minus its closing price on the previous day, divided by the previous day’s closing price. The price rate of change is the proportion by which a company’s closing price rises or falls from one trading day to the next.
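
The calculation can be sketched in a few lines of Python. This is an illustration rather than the code used in the study; the function name and the sample prices are made up for the example.

```python
def price_rate_of_change(closes):
    """Return the day-over-day rate of change for a list of closing prices.

    The first day has no previous close, so the result is one element
    shorter than the input.
    """
    return [
        (today - previous) / previous
        for previous, today in zip(closes, closes[1:])
    ]


# Illustrative closing prices, not real market data.
closes = [100.0, 102.0, 99.96]
roc = price_rate_of_change(closes)
print(roc)  # roughly [0.02, -0.02]
```

A value of 0.02 means the stock closed 2% higher than it did on the previous trading day.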

Applying Machine Learning

Our model used the following input features for a given day:

  • The number of positive Tweets issued today
  • The number of negative Tweets issued today
  • The number of positive Tweets issued yesterday
  • The number of negative Tweets issued yesterday
  • Today’s stock price rate of change

Machine Learning Model

With Machine Learning we can take these input features and train the model. Given enough examples, potentially millions of data points, the model learns to forecast what the target variable will be.

Control Model

We compare our model with the Control Model, which is essentially the same process minus the Twitter sentiments. This is important because if we receive good results from our model, we want to make sure that the Twitter sentiment is contributing meaningful information. If the Control Model performs better than our model, it suggests that Twitter sentiment does not correlate with the stock price rate of change.
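
The comparison itself is simple to express. The sketch below assumes both models output an "up" or "down" label for each test day and that accuracy is the fraction of days predicted correctly; the article does not spell out the metric, so treat the function names, the labels, and the sample data as hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of days on which the predicted label matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one observation")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def sentiment_adds_information(model_predictions, control_predictions, actual):
    """True when the sentiment model beats the control model on the same days."""
    return accuracy(model_predictions, actual) > accuracy(control_predictions, actual)


# Made-up labels for four test days, for illustration only.
actual = ["up", "down", "up", "up"]
model_predictions = ["up", "down", "down", "up"]    # uses Tweets and price ROC
control_predictions = ["up", "up", "down", "up"]    # uses price ROC only

model_accuracy = accuracy(model_predictions, actual)
control_accuracy = accuracy(control_predictions, actual)
print(model_accuracy, control_accuracy)
```

Both models must be scored on the same test days, otherwise a difference in accuracy could come from the days chosen rather than from the Twitter sentiment.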

Tools Used In Our Model

Twitterscraper is a Python script available free online. It is a solid option for quickly collecting a large number of Tweets. We mined our Tweets based on hashtags and cashtags. Cashtags were introduced by Twitter a few years ago. When a Twitter user wants to talk directly about a company’s financial situation or its stock, they use a cashtag: a dollar sign ($) followed by the ticker symbol. This data is important for our specific model because it helps ensure that people are talking about stocks directly.
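
To make the idea of a cashtag concrete, here is a small sketch that pulls cashtags out of Tweet text with a regular expression. It is not part of Twitterscraper, and the pattern assumes tickers are one to five letters, which is a simplification.

```python
import re

# A cashtag is a dollar sign followed by a ticker symbol, e.g. $TSLA.
# Assumption: tickers are 1-5 letters; dollar amounts such as $500 are skipped.
CASHTAG_PATTERN = re.compile(r"(?<![\w$])\$([A-Za-z]{1,5})\b")


def extract_cashtags(text):
    """Return the upper-cased ticker symbols mentioned as cashtags in text."""
    return [ticker.upper() for ticker in CASHTAG_PATTERN.findall(text)]


tweet = "Buying $TSLA and $aapl today, paid $500"
print(extract_cashtags(tweet))  # ['TSLA', 'AAPL']
```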

Amazon SageMaker is Amazon Web Services’ integrated development environment for Machine Learning. It hosts server instances on which we can train pre-implemented Machine Learning algorithms. We designed our model using the logistic regression algorithm of SageMaker’s Linear Learner tool.

TextBlob is a natural language tool that is available free online. It is a Python library for processing textual data that provides a simple API for diving into common Natural Language Processing (NLP) tasks. We used TextBlob to analyze the polarity of Tweets, that is, whether they are positive or negative in sentiment.

To analyze our Tweets we created a TextBlob object, which is a specific type of Python object, from the text of each Tweet. TextBlob then outputs a number ranging from -1 to +1, which is the polarity of the Tweet. The more negative the number, the more negative the Tweet.
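
A minimal sketch of that step is shown below. The `TextBlob(text).sentiment.polarity` call is TextBlob’s own API; the `label_polarity` helper and its zero threshold are assumptions, since the article does not say how scores close to zero were counted.

```python
def tweet_polarity(text):
    """Return TextBlob's polarity score for a Tweet, a float from -1.0 to 1.0."""
    # Imported here so the labelling helper below works without TextBlob installed.
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def label_polarity(polarity, threshold=0.0):
    """Map a polarity score to 'positive', 'negative', or 'neutral'.

    The threshold is a hypothetical parameter: scores within
    [-threshold, +threshold] are treated as neutral.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"
```

With TextBlob installed, `label_polarity(tweet_polarity(tweet_text))` gives the label for one Tweet, and counting the labels per day yields the positive and negative Tweet counts used as input features.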

IEX Developer Platform is a web-based API supplying quoting and trading data. It allows you to access the stock price of a company in real time. Since we were more interested in historical stock data, we used the IEX Finance Python module to access the closing prices of Apple and Tesla stock over a two-year period from 2016 to 2018.

Training the Machine Learning Model

Our model data is organized by trading day. For each trading day, we have one data point. Each of these data points includes five input features that help predict the target. Training this Machine Learning model means giving the algorithm the input features together with the known target variable for each data point.

The goal of Machine Learning training is that, once we have supplied enough of these data points, the model learns to predict the target variable. Let’s say we are using the model today: we can mine today’s Twitter sentiments, yesterday’s Twitter sentiments, and today’s price Rate of Change (ROC). Once we have a solid and robust model, we can input these features and the model will tell us what it thinks tomorrow’s price ROC will be.
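
As an illustration of how one data point per trading day could be assembled, the sketch below pairs the five input features with the next trading day’s price ROC as the target. The dictionary keys and sample numbers are hypothetical, and "yesterday" is taken to mean the previous trading day in the list, which is an assumption.

```python
def build_examples(days):
    """Pair each trading day's five features with the next day's price ROC.

    `days` is a chronologically ordered list of dicts with the hypothetical
    keys 'positive' and 'negative' (Tweet counts) and 'roc' (that day's
    price rate of change).
    """
    examples = []
    # Start at index 1 (yesterday's counts are needed) and stop before the
    # last day (tomorrow's ROC is needed as the target).
    for i in range(1, len(days) - 1):
        yesterday, today, tomorrow = days[i - 1], days[i], days[i + 1]
        features = [
            today["positive"],
            today["negative"],
            yesterday["positive"],
            yesterday["negative"],
            today["roc"],
        ]
        examples.append((features, tomorrow["roc"]))
    return examples


# Made-up numbers for four trading days.
days = [
    {"positive": 10, "negative": 4, "roc": 0.01},
    {"positive": 12, "negative": 6, "roc": -0.02},
    {"positive": 8, "negative": 9, "roc": 0.03},
    {"positive": 15, "negative": 3, "roc": 0.00},
]
examples = build_examples(days)
print(len(examples))  # 2
```

The first and last days produce no data point, because one lacks a "yesterday" and the other lacks a "tomorrow".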

For our model, we used historic stock data from 2016 to 2018. For three-quarters of that period, we used the data to train our model and for the final quarter of the period, we tested the model.
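
A split like this keeps the data in time order, so the model is always tested on days that come after the ones it was trained on. A minimal sketch, with a hypothetical function name and a default of 0.75 matching the three-quarters described above:

```python
def chronological_split(examples, train_fraction=0.75):
    """Split time-ordered examples into an earlier training set and a later test set."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cutoff = int(len(examples) * train_fraction)
    return examples[:cutoff], examples[cutoff:]


# Eight placeholder trading days, numbered in time order.
train, test = chronological_split(list(range(8)))
print(train, test)  # [0, 1, 2, 3, 4, 5] [6, 7]
```

Shuffling the days before splitting would let information from later days leak into training, which is why the split is not randomized here.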

Model Results

Tesla 270,000 Tweets Model

Tesla 560,000 Tweets Model

Tesla 1.2M Tweets Model

Apple 1.7M Tweets Model

Apple 2.7M Tweets Model

How Do We Improve Our Model?

While we saw some interesting results with our Machine Learning model, more work is needed to see if we can establish a reliable correlation between social media sentiment and actual stock prices.

One area to improve is the sentiment analysis itself. One way to do this is by improving the way text is preprocessed, for example, by looking at the different spellings, abbreviations, or emoticons that are used and assigning sentiment to those indicators.

Next, we would like to explore other types of Machine Learning algorithms. So far, we don’t know if the relationship can even be modeled linearly.

Another approach we would like to try is using a social media model on top of a quantitative trading model to see what kind of results that would produce. If we layer our sentiment analysis on top of these models, perhaps it will be better grounded in current stock price trends, and the sentiment analysis can add pertinent information that might improve these models.

Conclusion

We also investigated how the number of Tweets affects the accuracy of the models. Our model was extremely inaccurate with a lower number of Tweets. But once we increased the number of Tweets, our model reached the control model’s accuracy. That means the more Twitter sentiments we are able to mine, the better our model can be, which suggests that Twitter sentiments can be an important factor in helping predict stock prices.

Our results suggest that the public’s negative and positive Tweets carry a strong relationship with price movements of individual stocks. While sentiment analysis might not be that useful on its own, perhaps the best use case for it would be to give quantitative traders an edge: a tool that competitors relying only on momentum analysis don’t have. We know one of the reasons stock prices follow a random walk, and can be predicted with very little accuracy, is that stocks are so acutely affected by the news. How can we tap into the news? The best way is through social media sentiment.

We believe that with further development this tool can be used to evaluate the public’s perception of a brand — and thereby better predict a stock price utilizing consumer sentiment.

For further details visit our website https://www.accentedge.com/

About the author: Ali Saeed used his summer internship at accentedge to research “AI vs. Momentum: Predicting Stock Prices Using Social Media Sentiment.” Saeed is from suburban Chicago and is currently a student majoring in Computer Science at Stanford University.
