05/12 — Final Report “Analysis of President Trump’s Tweets”

KJ Huang
Published in Visualization@SBU
May 13, 2020
Poster for Project

Introduction

Social media platforms such as Facebook and Twitter, which enable real-time information delivery and strong social networking, have received significant attention from the public and have helped us better understand the influence of social media on daily life.

Many studies have also documented the large impact of social media on the economy. By reshaping the news landscape and revolutionizing how news is disseminated, social media has become a factor that cannot be ignored when discussing the financial market. Opinion leaders play an important role here: they usually have many followers and replies, which translates into a loud voice on social media and gives them influence over the related field.

The best-known example is how Mr. Trump used Twitter to influence the 2016 US election. Taking advantage of Twitter’s real-time, short-form messages, Mr. Trump’s team successfully reached its goal of connecting with voters. Because the public is so absorbed in the flow of information on social media, Twitter became one of Mr. Trump’s most potent weapons for winning the election.

We therefore asked: if Mr. Trump’s tweets could have such an effect on an election, might they have similar impacts in other areas?

Allocating problems with datasets

As we all know, President Trump tweets frequently. We believe that these messages influence the country in many ways, especially the economy. To broaden our research, we decided to analyze President Trump’s influence on several different financial indicators.

First of all, stock markets play a principal role in the economy. Our first question is therefore “Could President Trump’s tweets have an impact on stock trends?” In our research, we take three main stock market indexes into account: the S&P 500, the Nasdaq Composite, and the Dow Jones Industrial Average. Because it tracks 500 of the largest U.S. companies, the S&P 500 may be the most accurate gauge of the U.S. economy, reflecting the risks and returns of the biggest companies. To improve the credibility and reliability of our research, we also use the Nasdaq Composite and the Dow Jones Industrial Average. We expected them to respond to President Trump’s tweets in similar ways.

Apart from the stock market, the US dollar plays an important role in the currency markets. Since social media has no practical boundaries, we found it interesting to analyze the connection between the two. Our second question therefore looks for a relationship between the currency and President Trump’s tweets.

To diversify our research, we also take into account other datasets that may be influenced by President Trump’s tweets, such as oil prices and housing prices. We focus mainly on his presidential period and hope to find clues and relationships between his words and these datasets that support our assumption.

Backend approach for the project

Process of data preprocessing for the tweets

Fig. 1
Table. 1

Fig. 1 shows how we calculate the sentiment score from the tweets; the input data is shown in Table. 1. First, we load the CSV file as a dataframe. A while loop then iterates until the end of the file. For each tweet, we remove the stop words using the NLTK library and rebuild the string with regular expressions. Because some tweets contain URLs, which would influence the sentiment score, we remove the URLs as well. Finally, we apply the built-in sentiment analysis from the Python library TextBlob to calculate the sentiment score. The result is shown in Fig. 2.
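The cleaning steps described above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the tiny `STOP_WORDS` set stands in for NLTK's English stop-word list, the function names `clean_tweet` and `polarity` are hypothetical, and stripping every non-letter character is an assumption about what the regular-expression step did. The TextBlob call is only attempted if the library is installed.

```python
import re

# Matches http(s) links and bare www. links so they can be dropped.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Anything that is not a letter or whitespace (assumed cleaning rule).
NON_LETTER_RE = re.compile(r"[^A-Za-z\s]")
# Tiny illustrative stand-in for NLTK's English stop-word list.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to"}


def clean_tweet(text):
    """Remove URLs, non-letter characters, and stop words from one tweet."""
    text = URL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    return " ".join(words)


def polarity(text):
    """Return TextBlob's polarity in [-1, 1], or None if TextBlob is missing."""
    try:
        from textblob import TextBlob
    except ImportError:
        return None
    return TextBlob(clean_tweet(text)).sentiment.polarity
```

Calling `polarity` on each row of the tweet dataframe would give one score per tweet, which can then be averaged per date.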

Fig. 2

Applying K-Means, PCA and MDS to the data

Fig. 3
Table. 2

Next, we demonstrate how we apply dimensionality reduction tools to our datasets. Before using those tools, we merged our datasets, as shown in Table. 2. As Fig. 3 shows, we first unify the date format, which makes the merge possible. We then applied the elbow method to find the optimal k for k-means (Fig. 4) and found that k = 5 is optimal. We then used this value when looking for the best number of principal components.
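The elbow is usually read off the plot by eye, and the post does not say otherwise. As an illustration only, one common heuristic picks the point on the inertia curve that lies farthest from the straight line joining its two ends. The function name `elbow_k` is hypothetical, and the inertia values are assumed to have been computed beforehand by running k-means once per candidate k.

```python
import math


def elbow_k(ks, inertias):
    """Pick the k whose (k, inertia) point lies farthest from the straight
    line joining the first and last points of the curve.

    Assumes at least two distinct k values and a curve whose first and
    last inertia values differ.
    """
    x0, x1 = ks[0], ks[-1]
    y0, y1 = inertias[0], inertias[-1]
    best_k, best_dist = ks[0], -1.0
    for k, inertia in zip(ks, inertias):
        # Rescale both axes to [0, 1] so neither dominates the distance.
        x = (k - x0) / (x1 - x0)
        y = (inertia - y1) / (y0 - y1)
        # After rescaling, the end points are (0, 1) and (1, 0),
        # so the reference line is x + y = 1.
        dist = abs(1.0 - x - y) / math.sqrt(2.0)
        if dist > best_dist:
            best_k, best_dist = k, dist
    return best_k
```

The inertia values would typically come from the clustering library itself, for example the `inertia_` attribute of a fitted scikit-learn `KMeans` model.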

Fig. 4
Fig. 5

Fig. 5 plots the variance on the y-axis against the number of components, and from it we found that 4 components is the best choice. With that, we could go on to apply multidimensional scaling (MDS).
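A common way to turn a variance plot like this into a number is to keep the fewest components whose cumulative share of the variance passes a threshold. The sketch below assumes that approach; the post does not state which criterion or threshold was used, so the name `components_needed` and the default of 0.9 are illustrative assumptions.

```python
def components_needed(variances, threshold=0.9):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches `threshold`.

    `variances` holds the variance explained by each principal component,
    sorted from largest to smallest.
    """
    total = sum(variances)
    cumulative = 0.0
    for count, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return count
    # Threshold above 1.0 or rounding issues: fall back to all components.
    return len(variances)
```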

We used two if/else conditions to determine the type of request. After calculating all the results, we send the JSON data to our front-end server.

Frontend approach for the project

Fig. 6 Dashboard Layout

To implement our dashboard layout, we use several of the visualization methods we learned in class to display the charts and components. The frontend uses only the d3.js library to draw all of the charts and plots. However, to preprocess our dataset when handling user actions and to format the datetime strings when initializing our x-axis date array, we also import lodash.js and moment.js, just to make our data processing easier.

As Fig. 6 shows, our dashboard can be separated into two main parts. The left part includes one stacked bar chart and one line chart, which share the same x-axis representing the date of the data. The right part includes a message display panel and one scatterplot for deeper analysis. We explain these components in more detail below.

The stacked bar chart is designed to visualize the messages that President Trump has tweeted. To make it convenient for the user to select any tweet, each block of a bar is one of the messages from that date. To emphasize the importance of each tweet, we fill each block with a color that depends on one property of the tweet, which the user can change with the selector at the top right. We explain the events registered on each block later.

The combined line chart contains the datasets that we believe are influenced by the President’s tweets: oil price, currency value, house price, stock price, and the average sentiment score for each date. We combine all datasets in one chart to see how they relate to each other, and we can find some signal in it when all the paths rise or drop together. The user can hide or re-enable any line by clicking its label in the legend block at the bottom right corner. Additionally, all the dots on the lines and all the blocks in the stacked bar chart above have user events registered. On mouseover, we display a tooltip with the exact value of the data point for the line chart, or a tooltip with the message content for the bar chart. On click, we display the details for that date.

The details pop up in the top right panel of our dashboard. The tweet message is displayed only when the user clicks on a block of the bar chart, since there may be multiple tweets on the same date. This is a details-on-demand approach that lets the user view all exact values with one click and compare them with others.

The last component of our dashboard, in the bottom right corner, is a scatterplot of our clustered result. We provide MDS with Euclidean distance, MDS with correlation distance, and PCA to explore how the data points in our dataset relate to each other. The user can switch between these views of the clusters by changing the value of the selector.
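The two MDS variants differ only in the distance matrix they start from. As a rough illustration, the sketch below computes both kinds of pairwise distance for rows of numeric features. The function names are hypothetical, and defining correlation distance as 1 minus the Pearson correlation is an assumption (some tools use 1 minus its absolute value instead).

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """1 - Pearson r: 0 for perfectly correlated rows, 2 for anti-correlated.

    Assumes neither row is constant, otherwise the correlation is undefined.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)


def distance_matrix(rows, metric):
    """Square matrix of pairwise distances between all rows."""
    return [[metric(r, s) for s in rows] for r in rows]
```

Either matrix can then be handed to an MDS implementation that accepts precomputed dissimilarities, which places the points in 2D so that the plotted distances approximate the matrix.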

Fig. 7 Comparing Trending — 1
Fig. 8 Comparing Trending — 2
Fig. 9 Comparing Trending — 4
Fig. 10 Comparing Trending — 3

Conclusion

In our proposal, we predicted that Trump’s tweets would have a huge impact on the US economy. However, the truth turned out to be different from what we had imagined. In Fig. 7, we plot the sentiment score against the stock market indexes: the S&P 500, the Nasdaq Composite, and the Dow Jones Industrial Average. We observed the data from 2015 to 2020. Surprisingly, only some of the tweets appear to have an effect on the stock market, namely those related to the stock market or the economy. Tweets that are not related to the stock market show no relationship.

In Fig. 8, the housing data shows almost no relationship to Trump’s tweets. We think there are two reasons. First, Trump’s tweets cover such a wide range of topics that most of them would not directly influence housing prices. Second, the granularity of the data may play a part: the tweets are daily data, while the housing price data we used is monthly. With such a lag in the data, it is hard to identify the corresponding response.
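One way to put the daily tweets and the monthly housing data on the same footing is to average the daily sentiment per month. The post does not say this was done; the sketch below is only an illustration, with the hypothetical function name `monthly_mean` and the assumption that dates are ISO strings of the form "YYYY-MM-DD".

```python
from collections import defaultdict


def monthly_mean(daily_scores):
    """Average daily values per calendar month.

    `daily_scores` maps ISO dates ("YYYY-MM-DD") to numbers; the result
    maps "YYYY-MM" to the mean of that month's values.
    """
    buckets = defaultdict(list)
    for date, score in daily_scores.items():
        # The first seven characters of an ISO date are the year and month.
        buckets[date[:7]].append(score)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}
```

Averaging hides within-month timing, so this only helps with coarse comparisons, not with spotting a response to a single tweet.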

Some positive relationships are shown in Fig. 9 and Fig. 10. Compared with the other datasets, currency and oil prices are more global in nature, so the impact of Trump’s tweets is amplified. We also discovered that, aside from the real-time effect, most of the results showed a delayed response over the following two or three days. As a message spreads through social media, its impact takes some time to show. Some research also points out that tweets with negative sentiment scores tend to take effect faster than positive ones.
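A delayed response of this kind can be checked numerically by correlating the sentiment on day t with the market value on day t + lag. The post reads the delay off the charts, so the following is only a sketch of how one might quantify it. The function names are hypothetical, and both series are assumed to be equally long, aligned day by day, and not constant.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(sentiment, market, lag):
    """Correlate sentiment on day t with the market value on day t + lag.

    `lag` is a non-negative number of days; lag 0 is the same-day case.
    """
    if lag > 0:
        # Drop the last `lag` sentiment days and the first `lag` market days
        # so that index i pairs sentiment[i] with market[i + lag].
        return pearson(sentiment[:-lag], market[lag:])
    return pearson(sentiment, market)
```

Comparing the value at lag 0 with the values at lags 1 to 3 shows whether the relationship is stronger a few days after the tweets than on the same day.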

To sum up, contrary to our expectation, Trump’s tweets did not have much effect on those datasets. However, we still believe that Trump’s tweets are a useful factor to consider when observing the influence of social media.

Future Work

Rather than using Trump’s tweets as the only main criterion, we would take other factors into account. The calculation of the sentiment score should also include an emoji score. Some research has discussed how people tend to express their true emotions through those icons.

For the stock market data, we could use tweets that carry a cashtag ($), which focus mainly on the stock market. With those changes, we could better observe the impact Trump has had on our world.

DEMO

REFERENCE

[1] Ge, Qi, Alexander Kurov, and Marketa Halova Wolfe. “Stock market reactions to presidential social media usage: Evidence from company-specific tweets.” SSRN Electronic Journal (2017).

[2] Dredze, Mark, et al. “How Twitter is changing the nature of financial news discovery.” Proceedings of the Second International Workshop on Data Science for Macro-Modeling. 2016.

[3] Shear, Michael D., et al. “How Trump Reshaped the Presidency in Over 11,000 Tweets.” The New York Times (2019).

[4] Bryden, John, and Eric Silverman. “Underlying socio-political processes behind the 2016 US election.” PloS one 14.4 (2019).

[5] Smailović, Jasmina, et al. “Predictive sentiment analysis of tweets: A stock market application.” International Workshop on Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data. Springer, Berlin, Heidelberg, 2013.

[6] He, Wu, et al. “Social media-based forecasting: A case study of tweets and stock prices in the financial services industry.” Journal of Organizational and End User Computing (JOEUC) 28.2 (2016): 74–91.

[7] Grieves, Jason A., et al. “Emoji for Text Predictions.” U.S. Patent Application No. 14/045,461.

[8] Kreis, Ramona. “The “tweet politics” of President Trump.” Journal of Language and Politics 16.4 (2017): 607–618.

[9] Oborne, Peter, and Tom Roberts. How Trump thinks: His tweets and the birth of a new political language. Head of Zeus Ltd, 2017.

[10] Colonescu, Constantin. “The Effects of Donald Trump’s Tweets on US Financial and Foreign Exchange Markets.” Athens Journal of Business & Economics 4.4 (2018): 375–388.

[11] Sprenger, Timm O., et al. “Tweets and trades: The information content of stock microblogs.” European Financial Management 20.5 (2014): 926–957.

[12] Simpson, Michael. “Do President Trump’s Tweets Increase Uncertainty in the US Economy?.” (2018).
