Donald Trump VS Joe Biden?
After the presidential debates, who do people mention the most: Donald Trump or Joe Biden?
The United States presidential election is coming very soon. This November, we will decide which candidate becomes the next president. The presidential debate showed us both candidates’ views on real-world problems and their plans going forward. Opinions were mixed, and many people turned to social media to share their thoughts. That raises the question of which candidate people are most interested in, or mention most, on social media platforms.
On Reddit, the Donald Trump community (r/donaldtrump) has over 33.3k members and the Joe Biden community (r/joebiden) has over 52.6k members. To answer the question with data, I collected Reddit activity posted after the debate to see who people mention the most: Donald Trump or Joe Biden. The most mentioned subreddits were each party’s support groups, which matters when judging which candidate has the strongest support. The data extracted from Reddit illustrates the relationships between these groups. This study focuses on the most popular subreddits in the election2020 hashtag (#election2020) community, which includes r/joebiden, r/donaldtrump, r/politicalmemes, r/thedavidkmanshow, and others.
Now it’s time to see how the relationships and connections work in this community.
The data from Reddit with PRAW
The 1,907 newest posts (as of October 3) with the election2020 hashtag (#election2020) were collected with the Python Reddit API Wrapper (PRAW). To see the network in the election 2020 community, we first clean and prepare the data as nodes and edges. Nodes are the Reddit accounts (Redditors) that took part in the election2020 hashtag on Reddit, either by mentioning others or by being mentioned. Edges are the actions or connections between those accounts within the hashtag.
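As a rough illustration of the node and edge preparation described above, the sketch below turns a list of (source, target) mention pairs into a node set and weighted edges. The `build_graph` helper and the sample records are hypothetical and are not taken from the actual dataset; the real collection step with PRAW is not shown.

```python
from collections import Counter


def build_graph(records):
    """Turn (source, target) mention pairs into a node set and weighted edges."""
    nodes = set()
    edge_weights = Counter()
    for source, target in records:
        if not source or not target:
            continue  # skip records with a deleted or missing account
        nodes.update((source, target))
        # Repeated mentions of the same target raise the edge weight.
        edge_weights[(source, target)] += 1
    return nodes, edge_weights


# Hypothetical sample records (not from the actual dataset).
sample = [
    ("user_a", "r/joebiden"),
    ("user_a", "r/joebiden"),
    ("user_b", "r/donaldtrump"),
    ("user_c", "r/joebiden"),
    (None, "r/politicalmemes"),
]
nodes, edges = build_graph(sample)
```

Counting repeated pairs as a weight is one possible choice; it keeps a single edge per pair while still recording how often the connection occurred.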
Data set with Gephi
I imported 1,011 nodes and 1,719 edges from the spreadsheet files into Gephi. As mentioned, nodes are the Reddit accounts (r/joebiden, r/donaldtrump, r/politicalmemes, r/thedavidkmanshow, etc.) and edges are the connections between Reddit accounts (post, share, or comment in #election2020).
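For readers who want to reproduce the spreadsheet step, here is a minimal sketch that writes a node table and an edge table using the column names Gephi's spreadsheet importer recognizes (`Id`, `Label`, `Source`, `Target`, `Type`, `Weight`). The `write_gephi_tables` helper and the sample values are assumptions for illustration, and treating the edges as directed is also an assumption, since the original files are not shown.

```python
import csv
import io


def write_gephi_tables(nodes, edge_weights, nodes_file, edges_file):
    """Write a node table and an edge table in CSV form for Gephi."""
    node_writer = csv.writer(nodes_file)
    node_writer.writerow(["Id", "Label"])
    for node in sorted(nodes):
        node_writer.writerow([node, node])

    edge_writer = csv.writer(edges_file)
    edge_writer.writerow(["Source", "Target", "Type", "Weight"])
    for (source, target), weight in sorted(edge_weights.items()):
        # "Directed" is an assumption; use "Undirected" if direction is not meaningful.
        edge_writer.writerow([source, target, "Directed", weight])


# In-memory buffers stand in for nodes.csv and edges.csv here.
nodes_buf, edges_buf = io.StringIO(), io.StringIO()
write_gephi_tables(
    {"user_a", "r/joebiden"},
    {("user_a", "r/joebiden"): 2},
    nodes_buf,
    edges_buf,
)
```

To write real files, pass handles opened with `open("nodes.csv", "w", newline="")` and `open("edges.csv", "w", newline="")` instead of the buffers.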
The picture shows the collaboration graph, the initial relation graph of this community. There are 3 crowded clusters, which appear to be mentioned the most or to have strong connections in this community.
Yifan Hu Algorithm
The initial relation graph is hard to read, so we can apply the Yifan Hu layout to make it easier to interpret. Now we can see the relationships between pairs of nodes in the picture.
Gephi also provides filters for visualizing this network by degree range, neighbor network, betweenness centrality, closeness centrality, modularity class, and other attributes.
Graph characterization
In the graph below, blue represents the Reddit accounts that supported and mentioned Joe Biden (the highest degree centrality), orange represents those for Donald Trump, and green represents the accounts that participated in #election2020. Of the 1,011 nodes, only 3 connect to both communities; these are known as bridge nodes. In other words, only 3 Reddit accounts mentioned and supported both Donald Trump and Joe Biden within the election 2020 hashtag, as shown in the picture below.
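The idea of a bridge node can be sketched in a few lines: a node counts as a bridge here if it has at least one connection into each of the two communities. The `find_bridge_nodes` helper, the community sets, and the sample edges are hypothetical; in the article the communities come from Gephi's visualization, not from this code.

```python
from collections import defaultdict


def find_bridge_nodes(edges, community_a, community_b):
    """Return nodes outside both communities that are linked to each of them."""
    neighbors = defaultdict(set)
    for source, target in edges:
        # Direction is ignored: any connection counts as a link.
        neighbors[source].add(target)
        neighbors[target].add(source)
    return sorted(
        node
        for node, linked in neighbors.items()
        if node not in community_a
        and node not in community_b
        and linked & community_a
        and linked & community_b
    )


# Hypothetical sample edges (not from the actual dataset).
sample_edges = [
    ("u1", "r/joebiden"),
    ("u1", "r/donaldtrump"),
    ("u2", "r/joebiden"),
    ("u3", "r/donaldtrump"),
]
bridges = find_bridge_nodes(sample_edges, {"r/joebiden"}, {"r/donaldtrump"})
```

In this sample only `u1` touches both communities, so it is the only bridge node returned.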
The statistics of this network, shown in the picture, are:
- The average degree is 1.00: on average, each node in this network is connected by 1 edge.
- The average weighted degree is 7.65.
- The network diameter is 4: the longest of the shortest paths in this network has length 4.
- The network density is 0.002: the number of actual edges divided by the number of potential edges is 0.002.
- The modularity of this network is 0.53.
- The average path length is 2.818: the average shortest path in this network has length 2.818.
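To make the definitions in the list above concrete, the following sketch computes average degree, density, diameter, and average path length on a tiny example graph. It assumes an undirected, unweighted graph and only counts pairs of nodes that can reach each other, so its formulas may not match Gephi's exactly (Gephi's values depend on whether the graph is treated as directed). The `graph_statistics` helper and the four-node sample are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, start):
    """Shortest-path distances (in edges) from start to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def graph_statistics(edges):
    """Basic statistics for an undirected, unweighted graph."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue  # self-loops are ignored in this sketch
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    n = len(adjacency)
    m = sum(len(linked) for linked in adjacency.values()) // 2
    # Shortest-path lengths over all ordered pairs of distinct, connected nodes.
    path_lengths = [
        distance
        for start in adjacency
        for target, distance in bfs_distances(adjacency, start).items()
        if target != start
    ]
    return {
        "average_degree": 2 * m / n,
        "density": 2 * m / (n * (n - 1)),
        "diameter": max(path_lengths),
        "average_path_length": sum(path_lengths) / len(path_lengths),
    }


# Hypothetical sample: a simple chain a - b - c - d.
stats = graph_statistics([("a", "b"), ("b", "c"), ("c", "d")])
```

On the chain, the two end nodes are 3 edges apart, so the diameter is 3, and 3 of the 6 possible edges exist, so the density is 0.5.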
Centrality
The picture highlights two centrality measures. The node with the shortest average distance to all other nodes has the highest closeness centrality, which means it can reach the most nodes quickly. The node that appears most often on the shortest paths between other nodes has the highest betweenness centrality, which means many paths must flow through it.
In the graph below, blue shows how many people supported and mentioned Joe Biden, and orange shows the same for Donald Trump. Joe Biden was mentioned more often in subreddits, which suggests he was talked about more than Donald Trump in the days after the debate.
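Both measures can be sketched for a small undirected, unweighted graph. The code below uses breadth-first search for distances and Brandes' accumulation scheme for betweenness; closeness is taken as the number of reachable nodes divided by the sum of distances to them. This is an illustrative sketch, not Gephi's implementation, and Gephi's normalization may differ. The function name and sample graph are hypothetical.

```python
from collections import deque


def closeness_and_betweenness(adjacency):
    """Closeness and betweenness centrality for an undirected, unweighted graph."""
    closeness = {}
    betweenness = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # BFS from source, tracking shortest-path counts and predecessors.
        distance = {source: 0}
        path_count = {node: 0 for node in adjacency}
        path_count[source] = 1
        predecessors = {node: [] for node in adjacency}
        visit_order = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            visit_order.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)
        total = sum(distance.values())
        closeness[source] = (len(distance) - 1) / total if total else 0.0
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(visit_order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                betweenness[node] += dependency[node]
    # Each undirected pair was counted from both endpoints, so halve the totals.
    return closeness, {node: value / 2 for node, value in betweenness.items()}


# Hypothetical sample: a chain a - b - c, where b sits between the other two.
sample_graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
closeness, betweenness = closeness_and_betweenness(sample_graph)
```

In the chain, `b` is one step from everyone and lies on the only path between `a` and `c`, so it ranks highest on both measures.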
Discussion and limitation
This study only collected posts from 1,011 Reddit accounts from September 30 to October 3, 2020. That is a small sample for judging who is the most mentioned or most supported presidential candidate. The community also receives new data and changes every day, as social media platforms do. The results are therefore very limited and reflect people’s opinions only during the week of the experiment.
As for the ethical concerns related to this experiment, we cannot draw conclusions about any group of people or infer who is going to be the president. Many people do not have social media accounts, and even among those who do, some do not post what they think or whom they support because elections and voting are too sensitive to discuss in public.