The Drunk Passenger Problem
If something happens, it always happens for a first time and a last time.
In my Random Walks class, I first learnt about this — there is always a first time you visit something and a last time, as being put together in the quote above. A bit philosophical way to start the conversation for a problem, but I hope you will find the statement relevant in your life as well as in this problem. For new readers, I have recently started a new series where I bring interesting probability puzzles and their solutions. You can also contribute your problems or solutions to this series, by reaching me out on LinkedIn, Twitter, Instagram or Email.
The problem goes like this…
A line of 100 airline passengers are waiting to board a plane. They each hold a ticket to one of the 100 seats on that flight. For convenience, let’s say that the n-th passenger in line has a ticket for the seat number n. Being drunk, the first person in line picks a random seat (equally likely for each seat). All of the other passengers are sober, and will go to their proper seats unless it is already occupied; In that case, they will randomly choose a free seat. You’re person number 100. What is the probability that you end up in your seat (i.e., seat #100) ?
Hmm.. interesting situation, but the only problem is there are so many things to handle at once and keep track. I will provide two different ways to approach this problem and what inspired me to do so.
- There are 100 people! Too many to handle. Let’s get down to simpler tractable numbers and observe if any pattern sprouts from these cases. As goes with induction, they are tend to be bit calculative at times but are very good choice when the numbers are pretty large in a problem.
- Good old, conditional probabilities. There is a big catch here though, if you start with a basic conditioning, as I will show in detail, there is a chance you will get stuck and curse me :P
Before we go ahead with any of these methods. As we do in each problem, we introduce some notations to formalize the discussion ahead.
Let’s start with the conditioning method. Now, as we usually tend to do, we will try to condition on something we wished we knew i.e. the seat on which the drunk man might have sat down. We call this event Di as described below and use it in tandem with Law of Total Probability (LOTP) to simplify our problem into seemingly smaller chunks.
Now, with this big confusion out of picture i.e. where the drunk man sat, we can start analyzing the conditional probabilities. Now, it is obvious that if the drunk man sat on his own seat i.e. #1, then every other person will take his/her own seat and eventually I can sit on my own seat. Otherwise if he sat on my seat then I will end up on sitting on seat #1 (why? because every other person will sit on his/her own seat).
But the real problem starts when we analyze Di for i = 2, …, 99. Because say, drunk man sat on 2nd person’s seat. Then now the 2nd person will sit randomly on one of the 99 empty seats. But again wherever he will sit the person with that seat number will randomly choose a seat among the empty seats and this mess continues.
As you may have a feel now of how difficult it becomes to keep track of conditional probabilities even for the A|D2 case. There are A|D3, …, A|D99. So, definitely this is not a tractable solution.
But we are missing a key ingredient about the structure of the problem i.e. the recursion — (very important observation) Once the drunk man takes the seat of 2nd person (or the ith person) then the 2nd person (or the ith person) becomes the new drunk man whose own seat is #1 (because, once he sits down in seat #1, all other persons following him will end up sitting on their own seats o.w. if he sits somewhere else the person of that seat will become our new drunk man). So, we can use this idea to reduce the problem into a recursive case as shown,
Now, that we have a recursive equation, we can use it artfully to try solve for f(n).
The above revealing identity means, we don’t need to care about the size of the problem i.e. whether it is 2 person or 50 person or 100 person, the probability of me getting my seat is always same. So, we can definitely without any worries reduce the problem to the simplest version with 2 person — me and the drunk man and 2 seats.
Quite interesting and surprising too. How can it be half? Is there a symmetry in play?? If so what is the symmetry i.e. what is the mapping between all the cases in [Me not getting my seat] to cases in [Me getting my seat] where both the cases are of equal probability.
Let’s try shedding light on the symmetry and a new approach to solve this problem (just in case you found the recursive approach not straightforward or non-trivial one), which I have never discussed in my blogs before. One that utilizes the philosophical quote at the start of this blog.
Conditional probabilities are strong concepts, only if you know how to condition on the right things. The initial strategy of conditioning on the seat of the drunk man led to a mess. But what caused that issue there? The repeatedly nested pattern of one guy taking another guy’s seat and so on. Hard to track right? Because we don’t know how long this will continue and the calculations became convoluted. There is one way out in such scenarios. The event [a man taking another man’s seat] started when the drunk man sat on a seat other than seat #1 and we are unaware about when it is going to end. We can condition on something which we know exists (why? since number of seats is bounded) and will happen (why? since everybody will be seated at the end) i.e. the event [last man to have a choice to sit on another man’s seat].
I present the argument below. You can see how elegant the argument goes and requires just petty calculations to arrive at the solution.
Now, the question is why are the conditional probabilities all equal to 1/2. That is because, if the i-th person is the last to make a choice on where to sit (i.e. because his seat is taken) then he will choose either #1 or #100, because if he chooses some other seat then the person of that seat also gets the choice to sit on someone else’s seat. But it will contradict our assumption or given condition that i-th person is the last to make a choice. Once, this i-th person takes seat #1, everyone else after him will take his own seat including me and if he takes seat #100, then everyone else after him will take his own seat except me, who will end up in seat #1.
So, the symmetry between the cases in [Me not getting my seat] to the cases in [Me getting my seat] is that if the last person is asked to make a choice to sit on someone else’s seat and takes seat #1, then that arrangement of people belongs to [Me getting my seat]; whereas if he takes seat #100 then that arrangement of people belongs to [Me not getting my seat]. Both of these arrangements have equal probability. This gives us our required mapping.
We reduced a seemingly challenging problem into a two liner solution with trivial calculations with a clever conditioning statement. Let’s check the validity of our solution in the next section.
The Python code for simulation is provided below:
# import necessary libraries
import numpy as np
# simulates one round of drunk passenger problem
def getSeat(n):
# set the empty seats list
seats = list(np.arange(n) + 1)
# flag to store whether last person gets his own seat
flag = 0
# iterating over people
for i in range(1, n+1):
# if people has his seat number empty and it's not the drunk guy he takes his seat
if i in seats and i != 1:
seats.remove(i)
# sits on his own seat
flag = 1
else:
# otherwise he randomly chooses from the remaining seats and takes it
rand_idx = np.random.choice(seats, 1)
seats.remove(rand_idx)
# sits on someone else's seat
flag = 0
return flag
# run simulation
NUM_SEATS = 100
N = 100000
print(f'Proportion of time in {N} simulations last person got his own seat:', np.mean([getSeat(NUM_SEATS) for _ in range(N)]))
Running it outputs the following:
Proportion of time in 100000 simulations last person got his own seat: 0.50231
Awesome! Our answer is indeed close to what the simulation results are. I hope you enjoyed this problem and learnt a new perspective of looking at problems which looks daunting at first glance. Hope to see you in an upcoming problem of the series :)