Airline on-time performance

Pranay Sawant
8 min read · Sep 7, 2021

image credit: https://cdn.businesstraveller.com/wp-content/uploads/fly-images/816144/iStock-498532108-916x517.jpg

Objective:

To extract insights from data consisting of flight arrival and departure details for all commercial flights within the USA for the year 2009, and to build a basic model to predict flight delays.

We will try to answer the questions below.

  • Which carrier performs better?
  • When is the best time of hour/day of week/date of month/month of the year to fly to minimize delays?
  • Do older planes suffer more delays?
  • Can you detect cascading failures as delays in one airport create delays in others? Are there critical links in the system? Show it clearly from specific date and impacts perspective.
  • Create a model to predict flight delays, and provide accuracy details.

Data Acquisition:

The data was collected from here.

The data consists of flight arrival and departure details for all commercial flights within the USA, from October 1987 to April 2008. This is a large dataset: there are nearly 120 million records in total, and it takes up 1.6 gigabytes of space compressed and 12 gigabytes when uncompressed.

Mapping to ML Problem:

  • Extract all insights using EDA
  • Build a classifier model to predict whether a flight will be delayed or not

Performance metric:

  • confusion matrix
  • precision, recall
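
As a minimal sketch of how these metrics relate, precision and recall can be derived from the four cells of a binary confusion matrix. The function names and the toy labels below are hypothetical and only illustrate the arithmetic (1 = delayed, 0 = on time); they are not taken from the original notebook.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = delayed, 0 = on time)."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(1, 1)],  # delayed, predicted delayed
        "fp": counts[(0, 1)],  # on time, predicted delayed
        "fn": counts[(1, 0)],  # delayed, predicted on time
        "tn": counts[(0, 0)],  # on time, predicted on time
    }

def precision_recall(cm):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    precision = cm["tp"] / predicted_pos if predicted_pos else 0.0
    recall = cm["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall

# Toy labels, made up purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(cm)
```

Because only about one flight in five is delayed in this data, precision and recall on the delayed class are more informative than plain accuracy.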

Data Information:

For this experiment, we considered only the 2009 data, which has around 6M rows and 110 columns.

Below is the basic table structure of our data.

Some columns have missing values, as shown below:

It is confirmed that there are no duplicate rows in the dataset.

Data pre-processing and Understanding:

I removed all columns in which more than 99% of the values are null.
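
A minimal sketch of this filtering step, assuming the rows are held as a list of dictionaries and that a missing value is stored as None or an empty string. The helper names and the column name MOSTLY_EMPTY are hypothetical; the original analysis may have used a dataframe library instead.

```python
def null_fraction(rows, column):
    """Fraction of rows where `column` is missing (None or empty string)."""
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)

def drop_sparse_columns(rows, threshold=0.99):
    """Return new rows without columns whose null fraction exceeds `threshold`."""
    columns = {name for row in rows for name in row}
    keep = [c for c in sorted(columns) if null_fraction(rows, c) <= threshold]
    return [{c: row.get(c) for c in keep} for row in rows]

# Toy data: MOSTLY_EMPTY is filled in only 1 row out of 200 (99.5% null).
rows = [{"ARR_DELAY": float(i % 30), "MOSTLY_EMPTY": None} for i in range(200)]
rows[0]["MOSTLY_EMPTY"] = "x"
cleaned = drop_sparse_columns(rows)
```

Here `cleaned` keeps ARR_DELAY and drops MOSTLY_EMPTY, because its null fraction of 0.995 is above the 0.99 threshold.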

Average arrival delay: 11.644 minutes

Google search result for average flight delay

So it is normal for a flight to be late by 12.4 minutes. Rounding up, we will consider a flight that is less than 15 minutes late as “ON TIME”, and a flight that is more than 15 minutes late as DELAYED.
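
The labelling rule can be sketched as a small function. This is an illustration under stated assumptions: the function name is hypothetical, the delay is given in minutes, and a delay of exactly 15 minutes is treated as DELAYED, which the text above does not specify.

```python
def arrival_status(arr_delay_min, cancelled=False, diverted=False, threshold=15):
    """Label one flight as CANCELLED, DIVERTED, DELAYED or ON TIME.

    Assumption: a delay equal to `threshold` counts as DELAYED.
    Cancelled and diverted flights are checked first because they
    usually have no arrival delay recorded.
    """
    if cancelled:
        return "CANCELLED"
    if diverted:
        return "DIVERTED"
    return "DELAYED" if arr_delay_min >= threshold else "ON TIME"

average_flight = arrival_status(11.644)          # the mean delay reported above
early_flight = arrival_status(-5)                # arrived 5 minutes early
late_flight = arrival_status(40)
cancelled_flight = arrival_status(None, cancelled=True)
```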

Pie chart for all flights

Total number of flights: 6450285

  • 5127157 arriving flights were on time (79.49%)
  • 1218288 arriving flights were delayed (18.89%)
  • 89377 flights were cancelled (1.39%)
  • 15463 flights were diverted (0.24%)
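
As a quick check of the arithmetic, the percentages can be recomputed from the counts reported above. The dictionary layout is only an illustration.

```python
# Counts as reported in the list above.
counts = {
    "On time": 5127157,
    "Delayed": 1218288,
    "Cancelled": 89377,
    "Diverted": 15463,
}

total = sum(counts.values())

# Share of each category as a percentage of all flights, rounded to 2 decimals.
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]} flights ({pct}%)")
```

The four categories add up to the reported total of 6450285 flights, so every flight falls into exactly one of them.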

Let’s try to find out why flights got cancelled.

Total number of Cancelled flights: 89377

  • 37680 flights were cancelled because of bad weather (42.16%)
  • 36364 flights were cancelled because of carrier issues (40.69%)
  • 15313 flights were cancelled because of NAS (17.13%)
  • 20 flights were cancelled because of security (0.02%)

Carrier Delay:

Carrier delay is within the control of the air carrier. Examples of occurrences that may determine carrier delay are: aircraft cleaning, aircraft damage, awaiting the arrival of connecting passengers or crew, baggage, bird strike, cargo loading, catering, computer, outage-carrier equipment, crew legality (pilot or attendant rest), damage by hazardous goods, engineering inspection, fueling, handling disabled passengers, late crew, lavatory servicing, maintenance, over sales, potable water servicing, removal of unruly passenger, slow boarding or seating, stowing carry-on baggage, weight and balance delays.

Late Arrival Delay:

Arrival delay at an airport due to the late arrival of the same aircraft at a previous airport. The ripple effect of an earlier delay at downstream airports is referred to as delay propagation.

NAS Delay:

Delay that is within the control of the National Airspace System (NAS) may include: non-extreme weather conditions, airport operations, heavy traffic volume, air traffic control, etc. Delays that occur after Actual Gate Out are usually attributed to the NAS and are also reported through OPSNET.

Security Delay:

Security delay is caused by evacuation of a terminal or concourse, re-boarding of aircraft because of security breach, inoperative screening equipment and/or long lines in excess of 29 minutes at screening areas.

Weather Delay:

Weather delay is caused by extreme or hazardous weather conditions that are forecasted or manifest themselves on point of departure, enroute, or on point of arrival.

So our first question is

Which carrier performs better?

Flights cancelled due to carrier delay:

flights cancelled due to carrier delay

It is clear that the carriers WN, AA, and UA had the most flights cancelled for carrier reasons, whereas “HA” flights were rarely cancelled.

Flights cancelled due to weather delay:

Weather-related cancellations are caused by bad weather, which is something we cannot control.

Flights cancelled due to NAS delay:

As we can see, “MQ” and “XE” are the carriers most affected by NAS delay.

Flight delay due to Security reasons:

Overall graph for cancelled flights:

all cancelled flights

As a customer, you never want a flight to get CANCELLED for any reason.

B6, F9, HA, and FL are the carriers whose flights are not cancelled very often.

Now let’s see how many flights are not CANCELLED but DELAYED.

WN and AA are the carriers whose flights were delayed most often. “HA”, “F9”, and “A5” flights were rarely delayed.
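
Raw counts favour small carriers, since a carrier that operates more flights will naturally accumulate more delays. A complementary view is the share of each carrier's flights that were delayed. The sketch below is an illustration with made-up records; the function name and the toy data are hypothetical and do not reproduce the 2009 figures.

```python
from collections import defaultdict

def delay_stats_by_carrier(flights):
    """flights: iterable of (carrier, is_delayed) pairs.

    Returns {carrier: (delayed_count, delayed_share)}.
    """
    totals = defaultdict(int)
    delayed = defaultdict(int)
    for carrier, is_delayed in flights:
        totals[carrier] += 1
        delayed[carrier] += int(is_delayed)
    return {c: (delayed[c], delayed[c] / totals[c]) for c in totals}

# Toy records, made up for illustration only.
flights = [
    ("WN", True), ("WN", True), ("WN", False), ("WN", False),
    ("WN", False), ("WN", False), ("WN", False), ("WN", False),
    ("HA", True), ("HA", False),
]
stats = delay_stats_by_carrier(flights)
```

In this toy data WN has more delayed flights in absolute terms (2 against 1) but a lower delayed share (25% against 50%), which shows why the two views can rank carriers differently.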

When is the best time of hour/day of week/date of month/month of the year to fly to minimize delays?

In which month are flights delayed the most?

It is clear that delays are lowest in the 11th month (November) and highest in the 12th month (December).

Delayed flights in each month, all months
Monthly analysis
Daily delay analysis (mean)

Roughly speaking, Saturday and Tuesday are good days to travel, and the 24th, 25th, and 4th through 9th are good dates of the month to travel.
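
The monthly, weekday, and day-of-month views are all the same operation: group flights by some part of the date and take the mean delay per group. A minimal sketch, assuming each flight is a (date, arrival delay in minutes) pair; the function name and the four sample flights are made up for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def mean_delay_by(flights, key):
    """Group (flight_date, delay_minutes) pairs by key(flight_date), then average."""
    groups = defaultdict(list)
    for flight_date, delay in flights:
        groups[key(flight_date)].append(delay)
    return {k: mean(v) for k, v in sorted(groups.items())}

# Toy flights, made up for illustration only.
flights = [
    (date(2009, 11, 3), 4.0),    # a Tuesday
    (date(2009, 11, 7), 2.0),    # a Saturday
    (date(2009, 12, 24), 10.0),  # a Thursday
    (date(2009, 12, 26), 30.0),  # a Saturday
]

by_month = mean_delay_by(flights, lambda d: d.month)
by_weekday = mean_delay_by(flights, lambda d: d.weekday())  # 0 = Monday ... 6 = Sunday
by_day_of_month = mean_delay_by(flights, lambda d: d.day)
```

Swapping the key function is all that is needed to move between the hour, weekday, date, and month questions.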

Do older planes suffer more delays?

With this limited data we cannot comment on this, because there is no column that indicates whether an aircraft is older or newer. If we can get more data in this direction, we can explore this aspect.

Can you detect cascading failures as delays in one airport create delays in others? Are there critical links in the system? Show it clearly from specific date and impacts perspective.

Yes. Cascading delays are a genuine effect and have a large impact on the system.

Please note the data below.

Flight ‘87099E’ was scheduled to depart at 1503 (3:03 PM) but departed at 1614 (4:14 PM), a delay of 71 minutes.

It flew from CVG to DTW. It was supposed to arrive at 1623 (4:23 PM) but arrived at 1723 (5:23 PM).

Now let’s see what happened at DTW airport.

At DTW, flight ‘85059E’ departed late. It was supposed to depart at 1723 (5:23 PM) but actually departed at 1753 (5:53 PM), 30 minutes late. Similarly, ‘88379E’ was delayed by 48 minutes.
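
The delay arithmetic in this example can be reproduced with a small helper that converts HHMM clock times to minutes. The second function is a rough, hypothetical heuristic for flagging possible knock-on delays: it lists late departures from the airport where a late flight landed, scheduled within a time window after that flight's scheduled arrival. The field names and the 90-minute window are assumptions, and a match only suggests a link; it does not prove that one delay caused the other.

```python
def hhmm_to_minutes(hhmm):
    """Convert a clock time stored as an integer like 1503 to minutes after midnight."""
    hours, minutes = divmod(hhmm, 100)
    return hours * 60 + minutes

def delay_minutes(scheduled, actual):
    """Delay between two HHMM times, assuming both fall on the same day."""
    return hhmm_to_minutes(actual) - hhmm_to_minutes(scheduled)

def knock_on_candidates(arrival, departures, window=90):
    """Late departures from the arrival airport, scheduled within `window`
    minutes after the late flight's scheduled arrival (a heuristic only)."""
    hits = []
    for dep in departures:
        if dep["origin"] != arrival["dest"]:
            continue
        gap = hhmm_to_minutes(dep["sched_dep"]) - hhmm_to_minutes(arrival["sched_arr"])
        late = delay_minutes(dep["sched_dep"], dep["actual_dep"])
        if 0 <= gap <= window and late > 0:
            hits.append((dep["id"], late))
    return hits

# Times taken from the example in the text.
first_leg_dep_delay = delay_minutes(1503, 1614)
first_leg_arr_delay = delay_minutes(1623, 1723)

arrival = {"id": "87099E", "dest": "DTW", "sched_arr": 1623, "actual_arr": 1723}
departures = [{"id": "85059E", "origin": "DTW", "sched_dep": 1723, "actual_dep": 1753}]
candidates = knock_on_candidates(arrival, departures)
```

The helper does not handle flights that cross midnight, which would need the flight date as well as the clock time.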

The more we dig in, the more such examples we find.

Create a model to predict flight delays, and provide accuracy details.

Yes, we can create a model that predicts whether a flight will be on time or not.

Data pre-processing for building the model:

  1. Removed all columns in which every row is null.
  2. Removed similar-looking columns. For example, OP_UNIQUE_CARRIER, OP_CARRIER, and OP_CARRIER_AIRLINE_ID all carry the same information, so instead of using duplicate features (columns) we can ignore them. Other features with similar behavior were removed in the same way.
  3. Filled missing values in the ‘CARRIER_DELAY’, ‘WEATHER_DELAY’, ‘NAS_DELAY’, ‘SECURITY_DELAY’, and ‘LATE_AIRCRAFT_DELAY’ features.
  4. Removed features that have a single state. For example, the YEAR and FLIGHT features each have only one value.
  5. Made TRAIN (64%), VALIDATION (16%), and TEST (20%) data splits.
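
The 64/16/20 proportions in step 5 are what two successive 80/20 splits produce: 20% is held out for TEST, and 20% of the remaining 80% (16% of the total) becomes VALIDATION. The sketch below assumes a simple random shuffle with a fixed seed; whether the original split was random, stratified, or time-based is not stated, and the function name is hypothetical.

```python
import random

def train_val_test_split(rows, test_frac=0.20, val_frac=0.20, seed=42):
    """Shuffle, hold out `test_frac` for TEST, then `val_frac` of the rest for VALIDATION."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # local RNG, so the global state is untouched

    n_test = round(len(shuffled) * test_frac)
    test_rows, rest = shuffled[:n_test], shuffled[n_test:]

    n_val = round(len(rest) * val_frac)
    val_rows, train_rows = rest[:n_val], rest[n_val:]
    return train_rows, val_rows, test_rows

# 1000 dummy row ids stand in for the flight records.
train_rows, val_rows, test_rows = train_val_test_split(range(1000))
```

With 1000 rows this gives 640 training rows, 160 validation rows, and 200 test rows, with no row appearing in more than one split.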

There are two types of features in this dataset: categorical features and numerical features.

We have to encode the categorical features. There are multiple approaches to this, and one-hot encoding is one such strategy. Instead of one-hot encoding, we used response coding.
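
Response coding replaces each category with the probability of the target class given that category, estimated on the training split only. The sketch below is one common formulation with Laplace smoothing, not necessarily the exact implementation used here; the function names, the smoothing constant `alpha`, the fallback value for unseen categories, and the toy data are all assumptions.

```python
from collections import defaultdict

def fit_response_coding(categories, labels, alpha=1.0):
    """Learn P(delayed | category) from training data with Laplace smoothing.

    labels are 0/1; alpha is added to both class counts, so the denominator
    grows by 2 * alpha for a binary target.
    """
    delayed = defaultdict(int)
    total = defaultdict(int)
    for cat, y in zip(categories, labels):
        total[cat] += 1
        delayed[cat] += y
    return {c: (delayed[c] + alpha) / (total[c] + 2 * alpha) for c in total}

def transform_response_coding(categories, table, default=0.5):
    """Replace each category with its learned probability.

    Categories never seen in training get `default` (no evidence either way).
    """
    return [table.get(c, default) for c in categories]

# Toy training data, made up for illustration only.
train_carriers = ["WN", "WN", "WN", "HA", "HA", "AA"]
train_delayed = [1, 1, 0, 0, 0, 1]

table = fit_response_coding(train_carriers, train_delayed)
encoded = transform_response_coding(["WN", "HA", "ZZ"], table)
```

For a binary target the probability of the other class is one minus this value, so a single column per categorical feature is enough. This keeps the feature count small compared with one-hot encoding, which would add one column per carrier or airport.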

After response coding, the numerical features can be normalized to the (-1, 1) range and then standardized.
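
A minimal sketch of the two scaling steps, assuming the scaling parameters are fitted on the training split and then reused for validation and test data. The function names and the sample distances are hypothetical.

```python
from statistics import mean, pstdev

def scale_to_unit_range(values, lo, hi):
    """Map the interval [lo, hi] linearly onto [-1, 1]."""
    span = hi - lo
    return [2 * (v - lo) / span - 1 if span else 0.0 for v in values]

def standardize(values, mu, sigma):
    """Subtract the mean and divide by the standard deviation."""
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

# Toy numerical feature (for example a distance), made up for illustration.
train_values = [100.0, 300.0, 500.0]

# Fit on the training data only, then apply the same parameters elsewhere.
lo, hi = min(train_values), max(train_values)
scaled = scale_to_unit_range(train_values, lo, hi)

mu, sigma = mean(scaled), pstdev(scaled)
z_after_scaling = standardize(scaled, mu, sigma)

# Standardizing the raw values directly, for comparison.
z_raw = standardize(train_values, mean(train_values), pstdev(train_values))
```

Standardization is unaffected by a linear rescaling of its input, so `z_after_scaling` and `z_raw` hold the same values here. In practice one of the two steps is usually sufficient.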

After that, we concatenate the categorical and numerical features.

The features/columns used for model building are listed below:

where ARR_DEL15 is used as the output column.

Model:

We tried a Logistic Regression model and a Random Forest classifier.

We performed hyperparameter tuning on both Logistic Regression and Random Forest. Of the two, the Random Forest model performed better.

Below is the Logistic Regression performance.

Logistic Regression performance

Below is the Random Forest performance.

Random Forest performance

Model performance:

As the train, validation, and test losses are not far from each other, the model does not appear to be overfitting. It is a decent classifier. With real-world data we may face more challenging issues, and there is room to improve its performance for real-time use.

Summary:

We explored the airline on-time dataset as far as possible, trying to find the best carrier and the best time to travel, and looking at the “cascading failure” problem.

We built a basic model using a Random Forest classifier and Logistic Regression; of the two, Random Forest performed better.

Future Scope:

Live weather data could help with real-time prediction. Airport size and airport capacity data could also improve real-time performance.

Thank you…!!!
