“Tesla shares tank after Elon Musk tweets the stock price is ‘too high’” was a recent headline, and it came even after a previous court order requiring him to get a company lawyer’s approval before issuing any written communications regarding Tesla’s finances. In this article, we scrape Elon Musk’s tweets, pull Tesla’s stock prices from Yahoo Finance, run sentiment analysis on the tweets, and look for a relationship between tweet sentiment and the variation in Tesla’s stock price.
All the code and *.csv files can be found in my GitHub repository, https://github.com/ApurvaMisra/tweet_analysis.git; only snippets of the code are provided here.
Extracting tweets and sentiment analysis
The GetOldTweets3 library was used to collect his tweets, going back to when he really started tweeting. Two advantages of using this library are:
- No requirement to create an app with Twitter
- No limit on the number of tweets extracted for an individual
The extracted tweets can be saved into a *.csv file using the “.to_csv” method in pandas.
For sentiment analysis, VADER (Valence Aware Dictionary and sEntiment Reasoner), a rule-based library, was used. It has a lexicon with a score attributed to each token; at the time of writing, the lexicon included 7,500 words and their corresponding scores. It splits the sentence into words, looks up the score for each one, and combines them into a compound score. This works well for short texts like tweets, but as sentences get longer, the order of the words has a much larger impact on the meaning, which a word-by-word lookup does not capture well.
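The lexicon lookup described above can be illustrated with a minimal sketch. This is not VADER itself: `TOY_LEXICON` and its valences are made up for illustration, and the real library also applies heuristics for negation, intensifiers, punctuation, and capitalization that are left out here. The squashing step `x / sqrt(x^2 + alpha)` with `alpha = 15` follows, to my understanding, the normalization used in VADER's open-source implementation.

```python
import math

# Hypothetical mini-lexicon; the real VADER lexicon is far larger and its
# valences differ from these made-up values.
TOY_LEXICON = {"great": 3.0, "love": 3.2, "bad": -2.5, "terrible": -2.1}


def compound_score(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum the valence of each token and squash the total into (-1, 1)."""
    tokens = [token.strip(".,!?") for token in text.lower().split()]
    total = sum(lexicon.get(token, 0.0) for token in tokens)
    return total / math.sqrt(total * total + alpha)


# No token of this tweet is in the toy lexicon, so the score is exactly 0.
print(compound_score("Tesla stock price is too high imo"))
# Two positive tokens push the score toward +1.
print(compound_score("I love this great car"))
```

With the actual library, the equivalent call would be `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` from the `vaderSentiment` package.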
Extracting stock price
Beautiful Soup, along with Selenium, was used to get the Tesla share prices.
If we visit the Yahoo Finance website and search for ‘TSLA’, the Tesla stock price comes up. We need prices going back to 2010, since that is when Elon Musk made his first tweet on the platform, so we go to the ‘Historical Data’ tab and change the time period accordingly. The web page contains a table; when we right-click it and select “Inspect”, the browser shows the HTML code, and by scrolling through that code we can find the “table” tag and its corresponding class, as given below.
The “.find” method is used to get the structure within the table tag; similarly, the nested “tr” and “td” tags can be parsed to get the contents of each row. The data from each row is stored in a list and written into a *.csv file.
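As a rough sketch of the parsing step, the snippet below runs Beautiful Soup on a small, hypothetical HTML string instead of the live page. The class name `prices` and the values are invented for illustration; the real Yahoo Finance markup has different class names and more columns. In a real run the HTML would come from Selenium's `driver.page_source` after the page has loaded.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical, simplified markup with made-up prices.
html = """
<table class="prices">
  <thead><tr><th>Date</th><th>Close</th></tr></thead>
  <tbody>
    <tr><td>May 01, 2020</td><td>100.00</td></tr>
    <tr><td>Apr 30, 2020</td><td>105.50</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "prices"})

rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # the header row has only <th> cells, so it is skipped
        rows.append(cells)

# Write the rows as CSV; an in-memory buffer stands in for a real file here.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Swapping `io.StringIO()` for `open("some_file.csv", "w", newline="")` would write the same rows to disk.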
Relationship between stock prices and sentiment
When Elon Musk posts a tweet, what we want to look at is the change in the closing price on that day compared to the previous day. The plot of closing stock price over the years, along with the sentiment value of the tweets, is given below. The stock price was scaled to the range [0, 1] to make it comparable with the sentiment value, which lies in [-1, 1].
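The scaling mentioned above is assumed here to be ordinary min-max scaling; the article does not say which method was used. A minimal sketch with made-up prices:

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


closes = [20.0, 50.0, 35.0, 80.0]  # hypothetical closing prices, not real TSLA data
scaled = min_max_scale(closes)
print(scaled)  # [0.0, 0.5, 0.25, 1.0]
```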
No distinguishable pattern can be observed in the above figure, although we can see the drop in stock price. Since we just want to see whether the drop or increase in stock price correlates with the sentiment of his tweets, we plot the difference between each day’s closing price and the previous day’s.
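The day-over-day difference can be sketched as follows. The dates and prices are hypothetical, and "previous day" is taken to mean the previous date present in the data, which is an assumption about how gaps such as weekends are handled.

```python
from datetime import date


def daily_close_change(closes):
    """Map each date to its close minus the close of the previous date in the data."""
    days = sorted(closes)
    return {today: closes[today] - closes[prev] for prev, today in zip(days, days[1:])}


# Hypothetical closing prices, not real TSLA data.
closes = {
    date(2020, 4, 29): 160.0,
    date(2020, 4, 30): 156.0,
    date(2020, 5, 1): 140.0,
}
changes = daily_close_change(closes)
print(changes[date(2020, 5, 1)])  # -16.0
```

The first date has no predecessor, so it does not appear in the result.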
There are some apparent places where a drop in sentiment and a drop in stock price overlap. Concentrating on the effect of only negative tweets on the stock price gives a Pearson correlation coefficient of -0.22, implying that more negative sentiment is associated with an increase in stock price, which goes against our hypothesis. This could be because the sentiment values did not correspond well with the sentiment he actually tried to convey. For example, the tweet “Tesla stock price is too high imo” was given a neutral sentiment value of 0, even though it should have been scored as negative if the bigram “too high” had been taken into account. Tokens like “imo”, “lmao”, and “lol” should also be assigned a sentiment value, which was not the case with VADER in this analysis.
In future work, a more advanced sentiment analysis technique will be used to find the relationship.
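For readers who want to see how such a coefficient is computed, here is a small sketch that keeps only negative-sentiment tweets and calculates the Pearson correlation by hand. The `(sentiment, price change)` pairs are invented, so the resulting value has nothing to do with the -0.22 reported above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (tweet sentiment, same-day change in close) pairs.
pairs = [(-0.6, 2.0), (0.4, 1.0), (-0.2, -1.0), (-0.8, 3.0), (0.0, 0.5)]

# Keep only the tweets with negative sentiment, as in the analysis above.
negative = [(s, d) for s, d in pairs if s < 0]
r = pearson([s for s, _ in negative], [d for _, d in negative])
print(round(r, 2))
```

A negative `r` here means that, within the negative tweets, lower sentiment goes together with a larger price increase; it says nothing about cause and effect.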