Boarding Now: An Exploratory Look Into US Airline Sentiment Tweets

Published in Social Media: Theories, Ethics, and Analytics

9 min read · Oct 8, 2020

Flying can be quite a stressful process. You book your ticket, pack your things, get to the airport, have a snack, board on time, depart and arrive on time with no issues on the plane. At least, this is what we hope. Sometimes it doesn’t go so smoothly.

Source: https://www.sciencemag.org/careers/2019/05/why-some-climate-scientists-are-saying-no-flying

I found a dataset on Kaggle.com titled Twitter US Airline Sentiment: a collection of tweets from February 2015 in which Twitter users and passengers describe their experience with a specific airline. Each tweet was categorized as positive, neutral, or negative through sentiment analysis. Tweets classified as negative were also given a subcategory for the reason the tweet was negative (that is, why the experience with that airline was bad), such as bad customer service or a late flight. The dataset covers six US airlines: American, United, Delta, Southwest, Virgin America, and US Airways.

So the question I wanted to answer from the dataset is a rather broad one: how do Twitter users generally feel about airlines? Do all the airlines fare generally about the same online in terms of criticism? Or is there one that has received more criticism than the rest? If so, why is that? Do people have better experiences with one airline than another? This is purely an exploratory analysis of the dataset. I have also explored the relationships between these users, their sentiments, the airlines, and the negative reasons behind the less-than-ideal flight experience.

To explore some of the relationships in the data, I used Gephi, software for network analysis and visualization. The workspace for Gephi looks like this:

Network Analysis 1a)

The first part of this network analysis includes two networks that I created purely to better understand the kind of data I was working with. The first is a network of each user and their associated tweet sentiment type (positive, negative, or neutral), created to get a sense of the distribution of users’ overall sentiments toward US airlines. A user may have posted multiple tweets and therefore may have more than one sentiment type across their tweets. Figure 2 below shows seven clearly defined clusters, each with its own color.

Green: users with neutral tweets
Blue: users with positive tweets
Purple: users with negative tweets
Black: users with negative and neutral tweets
Orange: users with negative and positive tweets
Pink: users with neutral and positive tweets
Teal: users with negative, neutral, and positive tweets

Figure 2: Network of users and their sentiment review type (positive, negative, and neutral)

In these clusters, each node (except for three) represents one user. The three other nodes are the positive, negative, and neutral nodes that are in the middle of the green, blue, and purple clusters. The edges of the clusters represent the sentiment of the user’s tweets and connect to the positive, neutral, and/or negative nodes. There were a total of 7704 nodes and 9266 edges.
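The node count above is consistent with 7701 users plus the three sentiment nodes. As a minimal sketch of how such an edge list could be assembled in Python for import into Gephi (not necessarily how the network in this article was built), the example below uses hypothetical sample rows and assumes that repeated (user, sentiment) pairs collapse into a single edge.

```python
import csv
import io

# Hypothetical sample rows: (user name, tweet sentiment).
tweets = [
    ("user_a", "negative"),
    ("user_a", "negative"),  # same user, same sentiment again
    ("user_a", "positive"),
    ("user_b", "neutral"),
    ("user_c", "negative"),
]

# Assumption: one edge per distinct (user, sentiment) pair.
edges = sorted(set(tweets))

# Nodes are every user plus every sentiment label that appears.
users = {user for user, _ in edges}
sentiments = {sentiment for _, sentiment in edges}
nodes = users | sentiments

# Write an edge table with Source/Target headers, the column names
# Gephi's spreadsheet importer looks for in an edge list.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target"])
writer.writerows(edges)

print(len(nodes), "nodes,", len(edges), "edges")
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the same rows would be written to a CSV file and loaded through Gephi's spreadsheet import.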

Figure 3: Histogram of negative, neutral, and positive tweets

Just from looking, we can see that the purple (negative) cluster is the largest, meaning that most of the feedback airlines received was negative. A histogram (Figure 3) quickly created in Python reflects the same observation at the tweet level, although it does not show the overlap of users across sentiment types. Table 1 lists the percentage of nodes in each cluster and again shows that users with only negative tweets (purple nodes) make up the largest group.
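A tweet-level count like the one behind Figure 3 can be sketched with the standard library alone. The labels below are hypothetical sample data, not the real distribution, and the text bars merely stand in for a plotted histogram.

```python
from collections import Counter

# Hypothetical sentiment labels, one per tweet.
labels = [
    "negative", "negative", "negative", "neutral",
    "positive", "negative", "neutral", "positive",
]

counts = Counter(labels)
total = sum(counts.values())

# Print a text histogram, largest class first.
for sentiment, count in counts.most_common():
    share = 100 * count / total
    print(f"{sentiment:<8} {'#' * count} {count} ({share:.1f}%)")
```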

Most users fall into one of the three outer clusters (green, blue, purple), meaning all of their tweets shared a single sentiment, but there is also some overlap. A handful of users posted at least one tweet of each type: positive, neutral, and negative. This may suggest that some users gave airlines multiple pieces of feedback and that something in their experience changed between positive, neutral, and negative. It could also simply reflect feedback on several different flight experiences with different airlines.
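Three sentiment labels allow 2^3 - 1 = 7 non-empty combinations, which matches the seven clusters in the network. A minimal sketch of assigning each user to one of those combinations, using hypothetical sample rows:

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (user name, tweet sentiment).
tweets = [
    ("user_a", "negative"), ("user_a", "negative"),
    ("user_b", "negative"), ("user_b", "neutral"),
    ("user_c", "positive"),
    ("user_d", "negative"), ("user_d", "neutral"), ("user_d", "positive"),
    ("user_e", "negative"),
]

# Collect the set of sentiment types each user has posted.
sentiments_by_user = defaultdict(set)
for user, sentiment in tweets:
    sentiments_by_user[user].add(sentiment)

# A frozenset is hashable, so each combination can be counted directly.
cluster_sizes = Counter(frozenset(s) for s in sentiments_by_user.values())
total_users = len(sentiments_by_user)

for combo, size in cluster_sizes.most_common():
    label = " + ".join(sorted(combo))
    print(f"{label}: {size} users ({100 * size / total_users:.0f}%)")
```

The percentages printed here are the same kind of figure a cluster-percentage table reports, computed per user rather than per tweet.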

Table 1: Closeness Centrality Cluster Percentages of Figure 2

To create this network in Gephi, I used the Yifan Hu layout (an algorithm well suited to visualizing large networks) in the Layout tab of the workspace. With an optimal distance of 200.0, the nodes were much more spaced out for easier viewing. I also ran the Network Diameter statistic (undirected) to obtain further statistics. The average path length between nodes was 2.888. Finally, under Partition in the Appearance tab, I chose Closeness Centrality to help locate the most central nodes. The teal cluster is the most central in this network, likely because its users posted positive, neutral, and negative tweets and so connect to all three sentiment nodes. The same process of applying the Yifan Hu layout, running the Network Diameter statistic, and using Closeness Centrality to shape the network is repeated for all of the following networks.
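To illustrate why users connected to all three sentiment nodes end up most central, here is a small sketch that computes closeness centrality and average path length with a breadth-first search. It uses the textbook definition of closeness, (n - 1) divided by the sum of shortest-path distances, on a hypothetical seven-node graph; Gephi's exact normalization may differ.

```python
from collections import deque

# Hypothetical toy network: four users linked to the sentiments they tweeted.
edges = [
    ("u1", "negative"),
    ("u2", "negative"), ("u2", "neutral"),
    ("u3", "negative"), ("u3", "neutral"), ("u3", "positive"),
    ("u4", "positive"),
]

# Build an undirected adjacency map.
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)


def distances_from(start):
    """Breadth-first search: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


n = len(adjacency)
closeness = {}
total_distance = 0
for node in adjacency:
    distance_sum = sum(distances_from(node).values())
    closeness[node] = (n - 1) / distance_sum  # assumes a connected graph
    total_distance += distance_sum

# Mean shortest-path length over all ordered pairs of distinct nodes.
average_path_length = total_distance / (n * (n - 1))
most_central = max(closeness, key=closeness.get)

print("most central node:", most_central)
print("average path length:", round(average_path_length, 3))
```

In this toy graph the user linked to all three sentiment nodes (u3) is one hop from every sentiment and two hops from every other user, which is the same effect that pulls the teal cluster toward the center.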

Network Analysis 1b)

The next network (Figure 4) simply groups users by the US airline their tweets were directed at. Each colored cluster in Figure 4 represents the users who tweeted at one airline.

Figure 4: Network of users’ tweets by airline type | Figure 5: Cluster of users in between American and US Airways

In this network, there are six clusters, one for each of the six airlines in the dataset. The cluster colors are defined below.
Green: Southwest
Blue: United
Purple: Virgin America
Yellow: Delta
Red: US Airways
Pink: American

Figure 6: Histogram of tweets by airline

Each node represents a user and the edges represent the connection to each airline cluster. There were 7709 nodes and 7920 edges. The average path length between nodes was 3.598. The purpose of this network is to see which airlines users were tweeting at, that is, which airlines had the most customers tweeting at them and which users use which airlines.

Table 2: Closeness Centrality Cluster Percentages of Figure 4

From appearance alone, the blue cluster (United) is the largest, suggesting that more users directed tweets at United Airlines than at any other airline, regardless of sentiment type. Figure 6, a histogram of tweets by airline, and Table 2, with the cluster percentages, both confirm this. Virgin America (purple) is the smallest cluster and is likewise confirmed to have the fewest tweets. There is some overlap, with nodes in the middle of the graph indicating users who have tweeted at more than one airline. Looking closer in Figure 5, there are a handful of users between the US Airways and American clusters, a more noticeable overlap than between other pairs of airlines. Coloring by closeness centrality lets us see that, while several users in the center of the network tweeted at multiple airlines, the majority of users directed their tweets at only one airline.
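The overlap between airline clusters can also be counted directly rather than read off the layout. A minimal sketch with hypothetical sample rows, counting how many users tweeted at each pair of airlines:

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample rows: (user name, airline the tweet was directed at).
tweets = [
    ("user_a", "United"), ("user_a", "United"),
    ("user_b", "American"), ("user_b", "US Airways"),
    ("user_c", "American"), ("user_c", "US Airways"),
    ("user_d", "Delta"),
    ("user_e", "United"), ("user_e", "Southwest"),
]

# Collect the set of airlines each user tweeted at.
airlines_by_user = defaultdict(set)
for user, airline in tweets:
    airlines_by_user[user].add(airline)

# Users connected to more than one airline sit between clusters.
multi_airline_users = {
    user: airlines
    for user, airlines in airlines_by_user.items()
    if len(airlines) > 1
}

# Count shared users for every pair of airlines.
pair_counts = Counter()
for airlines in multi_airline_users.values():
    for pair in combinations(sorted(airlines), 2):
        pair_counts[pair] += 1

print(len(multi_airline_users), "of", len(airlines_by_user), "users tweeted at 2+ airlines")
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Sorting each user's airlines before pairing keeps ("American", "US Airways") and ("US Airways", "American") from being counted as different pairs.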

Network Analysis 2)

The last two networks show what kind of feedback each airline received. Figure 7 displays the network of the airlines and the sentiments of the tweets directed at them. Figure 8 displays the network of reasons for negative tweets and the airlines associated with those tweets. To make these networks, I additionally used the Ranking feature under Edges in the Appearance tab, with Weight as the ranking factor, to get a better sense of how many tweets were aimed at each airline. This let me see which airlines received the most negative tweets and which issues users raised most often. If an airline node points to a sentiment or negative reason node with a larger arrow, more tweets are associated with that pair. For United Airlines in Figure 7, for example, the large arrow pointing at the negative node means a large number of negative tweets were directed at that airline. I also used a color gradient from red (fewest tweets) through yellow to blue (most tweets) as a second indicator of the number of tweets. In the same example, the arrow from United to the negative node is blue, which again means that many negative tweets were directed at United, while the smaller red arrows pointing at the positive and neutral nodes represent fewer tweets.

Large/blue edges: more tweets
Small/red edges: fewer tweets
Orange, yellow, green/medium edges: somewhere in between
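The edge weights that drive this size and color ranking are simply tweet counts per (airline, sentiment) pair. A minimal sketch of producing a weighted edge table for Gephi, using hypothetical sample rows:

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows: (airline, tweet sentiment).
tweets = [
    ("United", "negative"), ("United", "negative"), ("United", "negative"),
    ("United", "positive"),
    ("Delta", "neutral"),
    ("Delta", "negative"),
]

# Each distinct (airline, sentiment) pair becomes one edge whose weight
# is the number of tweets behind it.
weights = Counter(tweets)

# Gephi reads a "Weight" column in an edge table as the edge weight.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target", "Weight"])
for (airline, sentiment), weight in weights.most_common():
    writer.writerow([airline, sentiment, weight])

print(buffer.getvalue())
```

Unlike the user networks, repeated pairs are counted here instead of collapsed, since the count is exactly what the arrow size and color are meant to convey.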

From Figure 7, we can conclude that United and US Airways received more negative tweets than positive or neutral tweets, and more negative tweets than the other airlines. Delta and Virgin America received very few positive, neutral, or negative tweets, potentially meaning they’re “better” airlines or just that the sample of tweets about them was too small to gauge what consumers think of them.

Figure 7: Network of the tweet sentiments and airlines | Figure 8: Network of the negative reasons of tweets and associated airlines

Figure 8 was created with the same process as Figure 7. The primary issue consumers had with US airlines was Customer Service, ahead of the other negative reasons. American, US Airways, and United in particular suffer from this issue. The medium green arrows show that United and US Airways also drew complaints about Late Flights. Every airline received some tweets for each of the negative reasons, but some airlines received more of these negative tweets than others. Overall, however, Customer Service appears to be the underlying issue that US airlines may want to improve.
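The comparison of negative reasons across airlines can be sketched as a per-airline tally. The rows and reason labels below are hypothetical sample data, and the sketch assumes that non-negative tweets carry an empty reason.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (airline, sentiment, negative reason).
rows = [
    ("United", "negative", "Customer Service Issue"),
    ("United", "negative", "Customer Service Issue"),
    ("United", "negative", "Late Flight"),
    ("United", "positive", ""),
    ("American", "negative", "Customer Service Issue"),
    ("US Airways", "negative", "Late Flight"),
    ("US Airways", "negative", "Late Flight"),
    ("US Airways", "negative", "Customer Service Issue"),
]

# Tally reasons per airline and overall, skipping tweets without a reason.
reasons_by_airline = defaultdict(Counter)
overall = Counter()
for airline, sentiment, reason in rows:
    if sentiment == "negative" and reason:
        reasons_by_airline[airline][reason] += 1
        overall[reason] += 1

# Most frequent reason for each airline, as a (reason, count) pair.
top_reason = {
    airline: counts.most_common(1)[0]
    for airline, counts in reasons_by_airline.items()
}

for airline, (reason, count) in top_reason.items():
    print(f"{airline}: {reason} ({count})")
print("overall:", overall.most_common(1)[0])
```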

Limitations and Other Issues

As with any analysis, there are limitations and more that could be explored.

Upon looking at the data, there are tweets directed at JetBlue, but JetBlue is not identified in the airline category as one of the US airlines; these tweets are labeled as Delta. Only 2 tweets are directed at a Delta Twitter account (@deltaassist), and all other tweets associated with JetBlue are categorized as Delta. While this is not a case of mixed-up data entries, it is still an inaccuracy in the data.
Gephi offers many Layout algorithm options, statistics to run, and other algorithms to adjust the appearance of the network by nodes and/or edges. Not all of these were explored in this analysis. It is possible that the networks I created were not designed to be the best appearance-wise and there could be a better way to create them.
This dataset of tweets directed at airlines may not be a true representation of how consumers generally feel about airlines. As this dataset was from 2015, public opinions about these airlines may have changed.
I did not analyze how specific users feel about the different airlines they’ve tweeted at. If a user has tweeted at both United and American, did they like one better than the other?
It’s possible that not all negative tweets are true criticism of something the airline has done; some may instead be slander rather than a genuine critique of the user’s experience with that airline.
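The JetBlue labeling issue in the first limitation above can be checked by counting which account handles appear in tweets labeled as Delta. This is a minimal sketch with hypothetical sample rows and tweet text; the handle-parsing helper is an illustration only, not a full tweet parser.

```python
from collections import Counter

# Hypothetical sample rows: (airline label, tweet text).
rows = [
    ("Delta", "@JetBlue my bag is missing"),
    ("Delta", "@jetblue thanks for the upgrade!"),
    ("Delta", "@DeltaAssist flight delayed again"),
    ("United", "@united late again"),
]


def mentioned_handles(text):
    """Return the lowercased @handles in a tweet, trailing punctuation removed."""
    return {
        word.strip(".,!?:;").lower()
        for word in text.split()
        if word.startswith("@")
    }


# Count handles only within tweets whose airline label is Delta.
handle_counts = Counter()
for airline, text in rows:
    if airline == "Delta":
        handle_counts.update(mentioned_handles(text))

for handle, count in handle_counts.most_common():
    print(handle, count)
```

If most handles under the Delta label turn out to be JetBlue's, conclusions drawn about Delta would really describe JetBlue.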

Final Thoughts

Tweeting can be an efficient way for US airlines to receive feedback, as it is an additional and more direct channel of contact. However, the vast world of social media also means users may not get a response. So from this set of tweets, how do Twitter users generally feel about airlines?

More users tweeted at United than at any other airline, and their sentiments toward that airline were largely negative.
Customer Service is a common issue among several airlines.
The group of users who have flown with both American and US Airways similarly experienced Customer Service issues.
While Delta and Virgin America had fewer users who directed tweets at them, they also had fewer negative tweets and issues than other airlines according to consumers.
Most users directed a tweet at an airline if there was an issue they had.

Should airlines be doing better according to these results? Maybe, but oftentimes we only give feedback when there are problems as opposed to when everything goes smoothly. If your flight goes as planned, what’s the likelihood you will tweet your airline, “Thanks for a good flight!”? My guess is very low. Still, considering that Customer Service appears to be a larger issue in these negative tweets than other issues, perhaps the competing US airlines should consider this something to improve on.