Strategies for customer satisfaction enhancement of Southeast Airlines

Eunmi(Ellie) Jeong
May 19, 2020

Data analysis and strategy formulation

1. Project overview

NPS (Net Promoter Score) summarizes customer responses, on a scale of 1–10, to the question, “How likely is it that you will recommend our airline to a friend or colleague?” Respondents who score below 7 are detractors, those who score above 8 are promoters, and those in the middle range (a score of 7 or 8) are passives. For a given group, subtracting the percentage of detractors from the percentage of promoters gives the overall NPS. The idea behind NPS is that promoters are good customers to keep, and they may even provide free word-of-mouth advertising. Detractors are especially problematic because they may actively tell their social connections not to use the product or service. The customer survey dataset contains 10,282 rows and 32 columns of NPS-related data. Each row captures characteristics of a customer and a flight that might be related to the NPS. The ultimate goal of this analysis is to identify the major causes of customer satisfaction and dissatisfaction, then develop actionable insights that can lower customer churn for Southeast Airlines.
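
As a minimal sketch of the NPS calculation described above (written in Python for illustration, not the code used in this project), the thresholds can be applied like this. The function name `nps` and the sample ratings are hypothetical.

```python
def nps(scores):
    """Return the Net Promoter Score, in percentage points, for a list of ratings.

    Thresholds follow the definition above: below 7 = detractor,
    7 or 8 = passive, above 8 = promoter.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    promoters = sum(1 for s in scores if s > 8)
    detractors = sum(1 for s in scores if s < 7)
    # Percent promoters minus percent detractors
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical ratings: 4 promoters, 3 passives, 3 detractors
sample_scores = [10, 9, 9, 10, 8, 7, 8, 6, 3, 5]
sample_nps = nps(sample_scores)
print(sample_nps)
```

Passives do not appear in the numerator, but they still count in the denominator, so a large passive group pulls the score toward zero.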

2. Business questions

I listed four business questions and explored the dataset to answer them.

1). What are the main factors that create promoters and detractors?

2). Which are the best and worst partner airlines in terms of customer satisfaction?

3). What are the good and bad performing routes?

4). Are there any notable features in customers’ spending patterns related to the flight?

3. Exploratory data analysis

1). Data munging

I cleaned the data before analyzing it. First, I updated the column names by replacing every dot (‘.’) with an underscore (‘_’). I also set the row names to ‘NULL’ so that the row numbers are displayed in ascending order. Looking into the structure of the dataset, I found that many columns had missing values, indicated as ‘NA’, so I replaced them with reasonable values.
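
The cleaning steps above could be sketched roughly as follows, in Python rather than R. The fill rules are my assumption for illustration (median for numeric columns, most frequent value for categorical ones), since “reasonable values” can be chosen in many ways. The column name ‘Departure.Delay.in.Minutes’ and the sample rows are hypothetical, and missing values are represented as `None`.

```python
from collections import Counter
from statistics import median


def clean_column_name(name):
    """Replace every dot in a column name with an underscore."""
    return name.replace(".", "_")


def fill_missing(rows):
    """Rename columns, then fill missing values (None) column by column."""
    cleaned = [{clean_column_name(k): v for k, v in row.items()} for row in rows]
    for col in cleaned[0]:
        present = [row[col] for row in cleaned if row[col] is not None]
        if not present:
            continue  # nothing to infer a fill value from
        if all(isinstance(v, (int, float)) for v in present):
            fill = median(present)  # assumed rule for numeric columns
        else:
            fill = Counter(present).most_common(1)[0][0]  # assumed rule for categories
        for row in cleaned:
            if row[col] is None:
                row[col] = fill
    return cleaned


# Hypothetical sample rows; None marks a missing value
raw_rows = [
    {"Type.of.Travel": "Business travel", "Departure.Delay.in.Minutes": 10},
    {"Type.of.Travel": None, "Departure.Delay.in.Minutes": None},
    {"Type.of.Travel": "Business travel", "Departure.Delay.in.Minutes": 30},
    {"Type.of.Travel": "Personal Travel", "Departure.Delay.in.Minutes": 0},
]
clean_rows = fill_missing(raw_rows)
print(clean_rows[1])
```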

2). Text mining

Next, I did text mining on the customers’ comments and created word clouds using the ‘quanteda’ package. Among promoters’ comments, terms like ‘good’, ‘service’, ‘seat’, and ‘time’ appeared most frequently. In detractors’ comments, ‘delay’, ‘service’, ‘time’, and ‘luggage’ were the most frequent terms. Text mining gave me some sense of the factors related to customer satisfaction and dissatisfaction. However, the terms alone are not insightful enough to identify the major characteristics.
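
The term counts behind a word cloud can be approximated with a simple frequency count. This is only a Python sketch of the idea, not the ‘quanteda’ workflow itself. The stopword list, the function name `top_terms`, and the sample comments are hypothetical.

```python
import re
from collections import Counter

# Small, hypothetical stopword list; a real analysis would use a fuller one
STOPWORDS = {"the", "a", "at", "was", "and", "were", "my", "is", "to", "of", "on"}


def top_terms(comments, n=3):
    """Return the n most frequent non-stopword terms across all comments."""
    counts = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical detractor comments
detractor_comments = [
    "The delay was long and the service was poor",
    "Another delay and my luggage was lost",
    "Delay at the gate, luggage damaged",
]
print(top_terms(detractor_comments, 2))
```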

3). Summarizing variables

To understand the dataset better, I investigated each customer attribute further. As a first step, I created a histogram for each numeric variable and examined its shape. This showed me the distribution of customers’ demographic features, their spending, and Southeast Airlines’ flight operations. The histogram of ‘Likelihood_to_recommend’ indicates that customer satisfaction is quite high, as a large share of customers is clustered around 8–10.

As the next step, I compared the NPS across the values of different variables to find out which kinds of customers and flight conditions lead to promoters and detractors.
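
Comparing the NPS across the values of one variable amounts to grouping the ratings and scoring each group. The following Python sketch illustrates that idea under stated assumptions. The function names and the sample rows are hypothetical, while ‘Type_of_Travel’ and ‘Likelihood_to_recommend’ are columns from the dataset.

```python
from collections import defaultdict


def nps(scores):
    """NPS in percentage points: percent promoters (>8) minus percent detractors (<7)."""
    promoters = sum(1 for s in scores if s > 8)
    detractors = sum(1 for s in scores if s < 7)
    return 100.0 * (promoters - detractors) / len(scores)


def nps_by_group(rows, group_col, score_col="Likelihood_to_recommend"):
    """Group the ratings by the values of group_col and compute the NPS of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[score_col])
    return {group: nps(scores) for group, scores in groups.items()}


# Hypothetical sample rows
rows = [
    {"Type_of_Travel": "Business travel", "Likelihood_to_recommend": 9},
    {"Type_of_Travel": "Business travel", "Likelihood_to_recommend": 10},
    {"Type_of_Travel": "Business travel", "Likelihood_to_recommend": 7},
    {"Type_of_Travel": "Business travel", "Likelihood_to_recommend": 4},
    {"Type_of_Travel": "Personal Travel", "Likelihood_to_recommend": 5},
    {"Type_of_Travel": "Personal Travel", "Likelihood_to_recommend": 6},
    {"Type_of_Travel": "Personal Travel", "Likelihood_to_recommend": 9},
    {"Type_of_Travel": "Personal Travel", "Likelihood_to_recommend": 8},
]
travel_nps = nps_by_group(rows, "Type_of_Travel")
print(travel_nps)
```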

To sum up this comparison, the following attributes appear to be related to customer satisfaction.

  • Promoters (positive for NPS) : silver airline status, middle-aged (40–60 years old), male, business travel, business class, Cool & Young Airlines partner company, long distance flight.
  • Detractors (negative for NPS) : blue airline status, senior (over 60 years old), female, personal travel, eco-plus class, FlyFast Airways partner company, moderate distance flight.

4). Predictive modeling

  • Association rules mining

At this stage, I applied association rules mining to identify the major attributes that cause customers to become promoters or detractors. ruleset1 contains the association rules for detractors, and ruleset2 contains those for promoters. I reviewed the top 10 rules of each ruleset, sorted by lift in ascending order, and identified the factors that appear in them most frequently and consistently. The result mostly matches what I found by comparing the NPS of different variables.

  • Detractors : blue airline status, female, personal travel, eco class, senior, FlyFast Airways, origin from Texas
  • Promoters : silver airline status, business travel, middle-aged
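
To make the rule metrics concrete, here is a small Python sketch that computes support, confidence, and lift for a single rule. It is not the mining procedure used to build ruleset1 and ruleset2. The item labels such as "status=Blue" and the sample transactions are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of items
    antecedent:   set of items on the left-hand side of the rule
    consequent:   single item on the right-hand side of the rule
    """
    n = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_cons = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent <= t and consequent in t)
    if n == 0 or n_ante == 0 or n_cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_both / n                # share of all rows matching the whole rule
    confidence = n_both / n_ante        # P(consequent | antecedent)
    lift = confidence / (n_cons / n)    # confidence relative to the base rate
    return support, confidence, lift


# Hypothetical transactions: one set of attribute items per customer
transactions = [
    {"status=Blue", "travel=Personal", "nps=Detractor"},
    {"status=Blue", "travel=Personal", "nps=Detractor"},
    {"status=Blue", "travel=Business"},
    {"status=Silver", "travel=Business"},
    {"status=Silver", "travel=Business"},
    {"status=Blue", "travel=Personal", "nps=Detractor"},
    {"status=Silver", "travel=Personal"},
    {"status=Blue", "travel=Business"},
]
support, confidence, lift = rule_metrics(
    transactions, {"status=Blue", "travel=Personal"}, "nps=Detractor"
)
print(support, confidence, lift)
```

A lift above 1 means the attributes on the left-hand side occur together with the outcome more often than the outcome's base rate alone would suggest.
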

  • Support Vector Machine (SVM)

SVM is an effective tool for building a robust prediction model. Through linear modeling and association rules mining, I found several attributes that affect customer satisfaction more significantly than the other attributes in the dataset. Before confirming that these attributes are meaningful, their predictive quality should be tested. Therefore, I used an SVM model to check how well the selected attributes predict the outcome. If the trained model works well, it should predict the right outcome for most of the test data.

I used three major variables, ‘Airline_Status’, ‘Type_of_Travel’, and ‘Class’, to train the algorithm. To run the SVM model, I converted the categorical variables into discrete numerical variables. Then I applied different values of the ‘C’ parameter to see how the training error and cross-validation error changed. There was no big difference in the error rate. Overall, the training error was 0.17 and the cross-validation error was also 0.17, which means the expected prediction accuracy is about 83%. As a check, I applied the model to the test dataset and generated predictions. The result was 84% prediction accuracy (error rate: 399/2421 ≈ 16%; prediction accuracy: 100 − 16 = 84%), so predictions from this trained model can be expected to be right about 84% of the time.
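
The sketch below does not train an SVM. It only illustrates, in Python, two steps from the paragraph above: turning a categorical variable into integer codes and deriving the accuracy from the test-set error count (399 of 2421). The function names are hypothetical, and assigning codes in alphabetical order is my assumption.

```python
def encode_categories(values):
    """Map each distinct category to an integer code (alphabetical order, an assumption)."""
    levels = sorted(set(values))
    mapping = {level: code for code, level in enumerate(levels)}
    return [mapping[v] for v in values], mapping


def accuracy_from_errors(n_errors, n_total):
    """Return (error_rate, accuracy) as fractions of the test set."""
    error_rate = n_errors / n_total
    return error_rate, 1.0 - error_rate


# Sample values for the 'Class' variable
classes = ["Eco", "Business", "Eco Plus", "Eco"]
codes, mapping = encode_categories(classes)
print(codes, mapping)

# Test-set figures reported above: 399 wrong predictions out of 2421
error_rate, accuracy = accuracy_from_errors(399, 2421)
print(round(error_rate, 2), round(accuracy, 2))
```

Integer codes impose an order on the categories, which may or may not be meaningful. One-hot encoding is a common alternative when it is not.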

5). Mapping routes

Association rules mining also revealed some regional characteristics. According to the result, customers who traveled from Texas were associated with a low likelihood to recommend. To investigate further, I mapped the flight routes of detractors using the ‘ggplot2’ and ‘ggrepel’ packages. First, I kept only the rows of detractors who actually took their flights (no flight cancellation). Then I arranged the data in ascending order of ‘Likelihood_to_recommend’ and took the top 10 rows to focus on the routes with the lowest customer satisfaction. From this data, I generated a map of the 10 least satisfactory flight routes. The map (origin locations are blue dots, destination locations are red dots) shows that flights departing from Texas and Georgia are negatively evaluated by customers.
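
The filter, sort, and slice steps before mapping could look roughly like this in Python. This sketch covers only the data preparation, not the plotting. The column names ‘Flight_cancelled’, ‘Origin_State’, and ‘Destination_State’, the function name, and the sample rows are hypothetical.

```python
def worst_routes(rows, n=10):
    """Return the n lowest-rated routes flown by detractors (score below 7)."""
    flown_by_detractors = [
        r for r in rows
        if r["Flight_cancelled"] == "No" and r["Likelihood_to_recommend"] < 7
    ]
    # Ascending order puts the least satisfied customers first
    ordered = sorted(flown_by_detractors, key=lambda r: r["Likelihood_to_recommend"])
    return [
        (r["Origin_State"], r["Destination_State"], r["Likelihood_to_recommend"])
        for r in ordered[:n]
    ]


# Hypothetical sample rows
rows = [
    {"Origin_State": "Texas", "Destination_State": "Georgia",
     "Flight_cancelled": "No", "Likelihood_to_recommend": 2},
    {"Origin_State": "Georgia", "Destination_State": "Florida",
     "Flight_cancelled": "No", "Likelihood_to_recommend": 1},
    {"Origin_State": "Texas", "Destination_State": "Ohio",
     "Flight_cancelled": "Yes", "Likelihood_to_recommend": 1},
    {"Origin_State": "Illinois", "Destination_State": "Wisconsin",
     "Flight_cancelled": "No", "Likelihood_to_recommend": 10},
    {"Origin_State": "Texas", "Destination_State": "Nevada",
     "Flight_cancelled": "No", "Likelihood_to_recommend": 4},
]
print(worst_routes(rows, n=2))
```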

In contrast, the routes with high satisfaction are Illinois-Wisconsin and, for the most part, flights within Colorado, Georgia and California.

4. Analysis and interpretation

The exploratory analysis produced the key findings needed to answer the business questions.

1). What are the main factors that create promoters and detractors?

  • Detractors : blue airline status, female, personal travel, eco class, senior, FlyFast Airways, origin from Texas or Georgia
  • Promoters : silver airline status, male, business travel, middle-aged

2). What are the best and worst partner airlines in terms of customer satisfaction?

  • Best : Cool&Young Airlines Inc. It has the highest NPS, but it operates few flights: 115 out of 10,282, only about 1% of the total. Among promoters’ flights, Cheapseats Airlines Inc. accounts for the largest share, even though its NPS is lower than that of Cool&Young Airlines Inc.
  • Worst : FlyFast Airways Inc. It has the lowest NPS. Many flights departing from Texas and Georgia are operated by this company.

3). What are the good and bad performing routes?

  • Best : Illinois-Wisconsin and, for the most part, flights within Colorado, Georgia and California.
  • Worst : Flights which depart from Texas and Georgia

4). Are there any notable features in customers’ spending patterns related to the flight?

I found that female customers and blue airline status customers account for a large share of the shopping spend at the airport.

5. Actionable insights and recommendations

As the data analysis summary suggests, senior, female, personal-travel, and blue airline status customers are likely to be detractors, while middle-aged, male, business-travel, and silver airline status customers are likely to be highly satisfied promoters. Flights departing from Texas and Georgia also show very low customer satisfaction, so action is required to address this problem. Based on these insights about notable customer segments, here are my recommendations for Southeast Airlines to enhance customer satisfaction and reduce customer churn.

1). Improve facilities and customer services for female and senior customers.

2). Improve services and provide various promotions for blue airline status customers who are on personal travel.

3). Investigate the business operations of FlyFast Airways Inc. and the other partner airlines operating in Texas and Georgia. If necessary, consider replacing current partner companies with others that show a high level of customer satisfaction, such as Cool&Young Airlines Inc.

4). Offer shopping discount coupons or promotional deals to female and blue airline status customers, in cooperation with tax-free shops, to provide a better experience of traveling with Southeast Airlines.
