Airline Ticket Overbooking — Monte Carlo Simulation

Gaurang Swarge
5 min read · May 12, 2019


This is part of my series documenting small experiments in R or Python that solve data analysis and data science problems. These experiments might be redundant and may already have been written and blogged about by various people, but this is more of a personal diary of my own learning process. Along the way, I hope I can engage and inspire anyone else who is going through the same process. If someone more knowledgeable than me stumbles upon this blog and thinks there is a much better way to do things, or that I have erred somewhere, please feel free to share feedback and help not just me but everyone grow together as a community.

I recently read a simple and neat blog post by Mira Khare, Melanie Huynh, Arni Sturluson and Cory Simon (link here).

The authors of the post presented a way to optimise the number of tickets an airline should oversell to maximise its profits. They wrote the code in Julia. As I have been polishing my Python skills, I thought it would be a good idea to code the same thing in Python.

The airline business is cut-throat. It is one of those businesses where the supply side is limited (only two major aircraft manufacturers, Airbus and Boeing) and so is the client side (a finite number of regular flying customers). On top of that, there are rising fuel costs and stiff competition within the industry. The only differentiator between an airline and its competitors is the customer service it provides to attract more customers from that finite pool.

Because airlines operate on razor-thin margins, it is imperative that they maximise their profits as much as possible from every channel.

When an airline operates a flight on a route, it incurs a fixed cost to operate that flight, and it achieves maximum profit only when the flight is fully booked. But as we all know, a certain percentage of people always cancel at the last minute or do not show up for their flights. In a scenario like this, the airline loses a certain percentage of its profit (or the chance of booking more profit in the case of no-shows) when it flies with empty seats.

So what should an airline do? It overbooks the flight by a certain percentage or number, meaning it sells more tickets than the physical capacity of the aircraft, working on the assumption that some customers are not going to make their flight for various reasons.

There are times when this assumption fails spectacularly: all of the customers show up, and the airline has to deal with a situation where some of them cannot board because there just isn't any physical space left.

Now, as we said earlier, customer service is a major differentiator for an airline, so it is extremely important that, in its quest to maximise profit, it does not piss off so many customers that it drives loyal ones away to its competitors. To avoid that, airlines give incentives in the form of credit vouchers or even cash to people who are not allowed to board.

The challenge is to identify how many tickets an airline can oversell to maximise its profits, even after paying compensation for overselling.

This code can be used to identify the optimal number of tickets an airline should oversell to maximise profit, given the cost incurred when a customer has to be turned away because of overselling.

The code is below (link to the GitHub repo).

For this problem, we will assume a seating capacity of 100 for a flight and a 94% probability that a person shows up for their flight.

Start with importing the libraries we will require.

import seaborn as sns
import matplotlib as mlt
import random as rd
import numpy as np
from matplotlib import rcParams
import matplotlib.pyplot as plt

Initialising our assumptions:

seat_capacity = 100
prob_showup = 0.94

Next, we define a function that randomly returns a boolean, True if a person shows up and False otherwise, based on our assumed probability.

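def show_up(prob_showup):
    # Use the argument rather than a hard-coded 0.94, so the function
    # respects whatever probability the caller passes in.
    # rd.random() returns a float in [0, 1).
    return rd.random() < prob_showup  # True: person showed up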

Next, we define a function to simulate a flight and count how many customers showed up.

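#Simulating the flight, to figure out the total customers who showed up for a flight
def simulate_flight(tickets_sold, prob_showup):
    n = 0
    # range(tickets_sold) draws once per ticket sold;
    # range(1, tickets_sold) would skip one passenger.
    for i in range(tickets_sold):
        if show_up(prob_showup):
            n = n + 1
    return n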
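
As a side note, the per-ticket loop above can also be replaced by a single binomial draw, because each ticket holder shows up independently with the same probability. The snippet below is only a sketch of that idea: the helper name simulate_flight_binomial and the seed are my own assumptions and are not part of the original Julia post.

```python
import numpy as np

# Hypothetical helper (an assumption, not from the original post):
# draws the number of show-ups in one call instead of looping over tickets.
def simulate_flight_binomial(tickets_sold, prob_showup, rng):
    # Independent show-ups with a common probability mean the total
    # follows a Binomial(tickets_sold, prob_showup) distribution.
    return int(rng.binomial(tickets_sold, prob_showup))

rng = np.random.default_rng(seed=42)  # seed chosen arbitrarily for reproducibility
showups = simulate_flight_binomial(105, 0.94, rng)
print(showups)
```

For a single flight the difference is negligible, but when running thousands of simulations one draw per flight should be noticeably faster than one draw per passenger.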

Next, we define a function to calculate the total revenue that will be generated from the flight based on the total customers who show up for our flight.

Here, the variable revenue_per_seat is the revenue the airline is assumed to make for each seat sold, and voucher_cost is the cost the airline incurs when it has to turn away a customer because of overselling.

#simulating the net revenue per flight
def simulate_net_revenue(tickets_sold, seat_capacity, prob_showup, revenue_per_seat, voucher_cost):
    total_showups = simulate_flight(tickets_sold, prob_showup)
    if total_showups <= seat_capacity:
        # everyone who showed up gets a seat
        return revenue_per_seat * total_showups
    else:
        # passengers beyond capacity are turned away and compensated
        upset_customers = total_showups - seat_capacity
        return (total_showups * revenue_per_seat) - (voucher_cost * upset_customers)
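
To make the arithmetic concrete, here is a small worked example of the same revenue rule with the randomness taken out. The function net_revenue is a hypothetical helper I am introducing just for illustration; it takes the number of show-ups directly instead of simulating it, and uses the $400 seat revenue and $800 voucher cost assumed in this post.

```python
# Hypothetical deterministic version of the revenue rule (for illustration only).
def net_revenue(total_showups, seat_capacity, revenue_per_seat, voucher_cost):
    # Number of passengers who cannot be seated (0 if the flight is not full).
    upset_customers = max(0, total_showups - seat_capacity)
    return revenue_per_seat * total_showups - voucher_cost * upset_customers

# 98 show-ups on a 100-seat flight: nobody is turned away.
print(net_revenue(98, 100, 400, 800))   # 98 * 400 = 39200

# 103 show-ups: 3 passengers are turned away and compensated.
print(net_revenue(103, 100, 400, 800))  # 103 * 400 - 3 * 800 = 38800
```

With these numbers, every passenger beyond capacity brings in $400 but costs $800, so revenue peaks at $40,000 when exactly 100 people show up.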

Next, we initialise a few more variables with our assumptions:

We assume revenue per seat to be $400 and voucher_cost to be 2x revenue_per_seat. We will run 10,000 simulations for each overbooking level, and we allow overselling of up to 20 seats, i.e. 20% of the 100 seats.

We also initialise a revenue variable to store our simulation data.

revenue_per_seat = 400
voucher_cost = revenue_per_seat * 2
no_simulations = 10000
max_overbooking = 20
# one row per simulation, one column per overbooking level (0 to max_overbooking)
revenue = np.zeros(shape=(no_simulations, max_overbooking + 1))

Final simulation: we run 10,000 simulations for each number of tickets oversold and calculate the net revenue for each simulation.

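# range(max_overbooking + 1) fills every column of revenue, including the last
for tickets_overbooked in range(max_overbooking + 1):
    tickets_sold = seat_capacity + tickets_overbooked
    # range(no_simulations) fills every row, including row 0
    for i in range(no_simulations):
        revenue[i, tickets_overbooked] = simulate_net_revenue(tickets_sold, seat_capacity, prob_showup, revenue_per_seat, voucher_cost)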
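
The nested loops are easy to follow but slow in pure Python. Below is a rough vectorised sketch of the same simulation, under the same assumptions as above (100 seats, 94% show-up probability, $400 per seat, $800 voucher, up to 20 tickets oversold). The names rng, revenue_fast, mean_revenue and best are my own, and the seed is arbitrary.

```python
import numpy as np

seat_capacity = 100
prob_showup = 0.94
revenue_per_seat = 400
voucher_cost = 2 * revenue_per_seat
no_simulations = 10000
max_overbooking = 20

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only

# One column per overbooking level: 100, 101, ..., 120 tickets sold.
tickets_sold = seat_capacity + np.arange(max_overbooking + 1)

# Draw show-ups for every (simulation, overbooking level) pair at once.
showups = rng.binomial(tickets_sold, prob_showup,
                       size=(no_simulations, tickets_sold.size))

# Passengers beyond capacity are turned away and compensated.
upset = np.maximum(showups - seat_capacity, 0)
revenue_fast = revenue_per_seat * showups - voucher_cost * upset

# Average revenue per overbooking level, and the level with the highest average.
mean_revenue = revenue_fast.mean(axis=0)
best = int(mean_revenue.argmax())
print(best, mean_revenue[best])
```

The array revenue_fast has the same layout as revenue (rows are simulations, columns are tickets oversold), so it could be passed to the same box plot code.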

Plotting the data. I have customised the y-axis for a better visual presentation of the graph:

#Plotting the simulation
sns.set(rc={'figure.figsize': (10, 15)})
ax = sns.boxplot(data=revenue, notch=True)
plt.xlabel("No. of Tickets Oversold")
plt.ylabel("Net Revenue")
plt.ylim(33000, 40500)
plt.yticks([33000, 35000, 36000, 37000, 38000, 39000, 40000])
plt.show()

Figure: Revenue per oversold ticket

As we can see from the box plot above, revenue is lower when we oversell 0 tickets. As we start overselling, revenue increases before it starts to fall again. When we oversell more than 14 tickets, the revenue generated is less than the revenue generated without overbooking.

From the graph we can infer that maximum revenue is generated when an airline oversells about 6–9 tickets.
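
As a cross-check on the simulation, the expected net revenue can also be computed exactly, since the number of show-ups follows a binomial distribution. This is a sketch I am adding on my own, not something from the original post: the function expected_net_revenue is hypothetical, and it hard-codes the same assumptions as defaults (100 seats, 94% show-up probability, $400 per seat, $800 voucher).

```python
from math import comb

# Hypothetical helper: exact expected net revenue for a given number of tickets sold.
def expected_net_revenue(tickets_sold, seat_capacity=100, prob_showup=0.94,
                         revenue_per_seat=400, voucher_cost=800):
    total = 0.0
    for k in range(tickets_sold + 1):
        # Binomial probability that exactly k ticket holders show up.
        pmf = comb(tickets_sold, k) * prob_showup**k * (1 - prob_showup)**(tickets_sold - k)
        upset = max(0, k - seat_capacity)
        total += pmf * (revenue_per_seat * k - voucher_cost * upset)
    return total

# Expected revenue for 0 to 20 tickets oversold.
expected = {extra: expected_net_revenue(100 + extra) for extra in range(21)}
best_extra = max(expected, key=expected.get)
print(best_extra, round(expected[best_extra], 2))
```

Note that this optimises the mean, while the box plot shows medians and quartiles, so the two views can point to slightly different numbers within the flat region around the peak.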

