Python: Use Delorean and Pandas to Calculate Your Next Flight Time
Manipulate Time Zone Aware and Naive datetime objects
Scope
Flight tickets state the departure and arrival date and time, but each in its own local Time Zone, which can make working out how long the flight is a bit tricky. In this article, the Delorean library is introduced and, along with pandas, used to manipulate datetime objects with Time Zone information. As an example, the flight time between London Heathrow, UK and Kuala Lumpur, Malaysia is calculated.
Introduction
If you’re of a certain age, then the Back to the Future films, which started in the 1980s and featured a futuristic car, were the best time travelling series to date. For everyone else, it’s a Tony Stark reference from Avengers: Endgame about time travel. This is a long-winded way of saying that the Delorean library is named after the car from the Back to the Future films and the goofy TV cartoons of the 1990s.
The Delorean library provides an alternative interface to manage Time Zone aware objects based on dateutil and pytz. In this article I’ll introduce Delorean, create and manipulate datetime objects and apply that capability to determine flight times from London Heathrow (LHR), United Kingdom to Kuala Lumpur (KUL), Malaysia.
Python Environment
Delorean Installation
The first step is setting up an appropriate environment; the gifcast below shows the process of cloning an environment with conda before installing delorean using pip.
The key steps are checking and cloning an existing conda environment (ds01). Delorean doesn’t appear to be available via conda-forge, therefore a pip install is required. Although conda packages shouldn’t really be mixed with pip packages, the two are compatible if the pip installation is the last step.
Jupyter Notebook Setup
The delorean environment has jupyter, pandas and associated packages pre-installed. For the purpose of providing a reproducible analysis, the package versions are listed:
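One way to produce such a listing from inside a notebook is sketched below. It uses only the standard library; the helper name installed_versions and the choice of packages are illustrative assumptions, not the author's original cell.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Hypothetical package list; adjust it to whatever the analysis imports.
versions = installed_versions(["Delorean", "pandas", "pytz", "python-dateutil"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Recording the versions alongside the analysis makes it easier to rebuild the same environment later.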
Delorean
Introduction
Initialising a Delorean object is the equivalent of generating a datetime object for the current date and time in Coordinated Universal Time (UTC), so two users anywhere in the world running the same command at the same moment would get the same result. The second command, Delorean.now(), localises the datetime object to the user’s local Time Zone, which is Europe/London for me.
Time Zone Shifting
The shift command is used to change the Time Zone of the utc variable from UTC to Europe/London; however, it modifies the original utc variable in place rather than leaving it untouched, which is unexpected. To address this, the utc variable has to be reinitialised:
The notebook snippet shows the creation of the utc variable again, but also a separate london variable, which takes the initialised Delorean object and shifts it to the London Time Zone. The same goal can be achieved by passing the timezone="Europe/London" parameter to the Delorean object.
The difference between the two variables (london and utc) is of the order of microseconds because both zones are currently the same. The small difference exists because both variables were initialised separately; to use a common timebase, the Delorean class accepts a Time Zone naive datetime object:
As both variables are now using the same time, but in different Time Zones, the difference between them is of course zero.
Kuala Lumpur, Malaysia
To determine the Time Zone for Kuala Lumpur in Malaysia, the most common Time Zones can be listed using pytz.common_timezones:
The list of common Time Zones can be filtered with a list comprehension looking for the keyword “Kuala”, which shows that the appropriate Time Zone is Asia/Kuala_Lumpur. The utc variable is then used to initialise a new variable kuala_lumpur with the Time Zone of the city. As expected, the time difference between London and Kuala Lumpur is zero: the same instant at two geographic locations will of course be the same!
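The keyword search described above can be sketched as follows, assuming pytz is installed; the variable name matches is illustrative.

```python
import pytz

# Keep only the common Time Zone names that mention the keyword "Kuala".
matches = [tz for tz in pytz.common_timezones if "Kuala" in tz]
print(matches)
```

The same pattern works for any city fragment, for example "London" or "Singapore".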
Pandas DateTime
The following notebook shows similar steps as above but using pandas:
The current date and time (without Time Zone information) is generated using pd.to_datetime('today'), which is then localised to Europe/London using .tz_localize and stored as pd_london. The day and time in Kuala Lumpur, Malaysia can then be created using the astimezone('Asia/Kuala_Lumpur') function. Note the output of both variables showing the time difference and Time Zones. Unfortunately, unlike Delorean, there is no easy way to subtract one of the two Time Zone aware variables from the other, hence the error message.
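A small sketch of the localise-then-convert steps is shown below. It uses a fixed wall-clock time instead of pd.to_datetime('today') so the output is repeatable, and tz_convert, for which Timestamp.astimezone is an alias; the date is an arbitrary assumption.

```python
import pandas as pd

# Arbitrary fixed time; London is on GMT (UTC+0) on this date.
pd_naive = pd.Timestamp("2021-01-15 12:00")

# Attach a Time Zone to the naive value, then express it in another zone.
pd_london = pd_naive.tz_localize("Europe/London")
pd_kuala_lumpur = pd_london.tz_convert("Asia/Kuala_Lumpur")

print(pd_london)          # 2021-01-15 12:00:00+00:00
print(pd_kuala_lumpur)    # 2021-01-15 20:00:00+08:00
```

tz_localize attaches a zone without moving the clock reading, whereas tz_convert keeps the instant and changes the clock reading.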
To work around the error, the pandas representations can be converted to datetime objects using .to_pydatetime(), which of course also show a zero time difference. It’s important to note that this is again as expected: the same instant in two different geographic locations will have a time difference of zero.
Calculating the clock difference between the two Time Zones requires converting the Time Zone aware representations to Time Zone naive representations using .tz_localize(None), which as expected shows an 8 hour difference. (The gap is 8 hours while London is on GMT and 7 hours during British Summer Time.)
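The two kinds of difference discussed above, zero between instants and 8 hours between clock readings, can be sketched side by side. The date is an arbitrary assumption chosen while London is on GMT, and the variable names instant_gap and wall_clock_gap are illustrative.

```python
import pandas as pd

pd_london = pd.Timestamp("2021-01-15 12:00", tz="Europe/London")
pd_kuala_lumpur = pd_london.tz_convert("Asia/Kuala_Lumpur")

# Same instant: standard datetime objects subtract to zero.
instant_gap = pd_kuala_lumpur.to_pydatetime() - pd_london.to_pydatetime()
print(instant_gap)

# Clock readings: drop the zone information, then subtract the local times.
wall_clock_gap = pd_kuala_lumpur.tz_localize(None) - pd_london.tz_localize(None)
print(wall_clock_gap)
```

Calling tz_localize(None) on an aware Timestamp keeps the local clock reading and simply discards the zone, which is why the subtraction then reflects the offset between the two cities.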
Flight Time Calculations
Pandas Approach
Having understood the basics of Time Zone manipulation using both Delorean and pandas, we can apply them to a real life application. We’re going to determine the flight time for the last two flights between London Heathrow, UK and Kuala Lumpur, Malaysia. The data is from Flight Radar 24, for Malaysia Airlines flight MH3 / MAS3:
The notebook snippet shows the creation of a DataFrame (df) using a dictionary of lists based on the Flight Radar 24 data. It consists of departure and arrival times and Time Zone information.
It may be tempting to convert the date and time information into native pandas datetime objects. Note that these are Time Zone naive representations. A simple subtraction of the two times shows a flight time of 19 hours and 40 mins, which is inconsistent with the actual flight time. The reason is that the time representations do not account for the two differing Time Zones.
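The pitfall can be reproduced with a small sketch. The times below are hypothetical values picked to match the 19 hours 40 mins figure; they are not the Flight Radar 24 data, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical schedule for illustration only; NOT the Flight Radar 24 data.
df = pd.DataFrame(
    {
        "departure": ["2021-01-15 21:00", "2021-01-16 21:00"],
        "departure_tz": ["Europe/London", "Europe/London"],
        "arrival": ["2021-01-16 16:40", "2021-01-17 16:40"],
        "arrival_tz": ["Asia/Kuala_Lumpur", "Asia/Kuala_Lumpur"],
    }
)

# Parsing the strings gives Time Zone naive values.
naive_departure = pd.to_datetime(df["departure"])
naive_arrival = pd.to_datetime(df["arrival"])

# Subtracting clock readings from two different zones overstates the flight.
naive_duration = naive_arrival - naive_departure
print(naive_duration)
```

The result mixes a London clock reading with a Kuala Lumpur one, so it includes the offset between the two cities as well as the time spent in the air.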
The remedy is to first localize the departure and arrival times to their respective Time Zones (using .tz_localize). Notice how the dtype information now includes the Time Zone. To calculate the flight time, a direct subtraction of the two columns is insufficient, as pandas does not support such operations across two different Time Zones. One of the columns has to be converted (using .dt.tz_convert) to a common Time Zone. The results show the correct flight times of around 11 hours 40 mins. Simple mental arithmetic confirms this: the previous 19 hours and 40 mins, adjusted for the 8 hour time difference between Zones, yields around 11 hours and 40 mins.
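A sketch of the localise, convert and subtract sequence follows, again with hypothetical times (not the Flight Radar 24 data) and assumed column names.

```python
import pandas as pd

# Hypothetical schedule for illustration only; NOT the Flight Radar 24 data.
df = pd.DataFrame(
    {
        "departure": ["2021-01-15 21:00", "2021-01-16 21:00"],
        "arrival": ["2021-01-16 16:40", "2021-01-17 16:40"],
    }
)

# Attach the local Time Zone to each column.
departure = pd.to_datetime(df["departure"]).dt.tz_localize("Europe/London")
arrival = pd.to_datetime(df["arrival"]).dt.tz_localize("Asia/Kuala_Lumpur")
print(departure.dtype, arrival.dtype)

# Bring the arrival times into the departure zone before subtracting.
arrival_in_london = arrival.dt.tz_convert("Europe/London")
df["flight_time"] = arrival_in_london - departure
print(df["flight_time"])
```

Converting either column works equally well; converting both to UTC is another common choice, since the subtraction only needs the two columns to share a zone.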
Sanity Check With Delorean
As a final check, departure and arrival Time Zone naive datetime objects are created with datetime.datetime.strptime, where strptime uses a string and a format to generate a datetime object. A Delorean object is then initialised for each, with its respective Time Zone. The difference between the two objects is expressed as a timedelta object. A limitation of the timedelta format is that differences are expressed as a combination of days and seconds and not in more appropriate units such as hours:
The sanity check shows that the difference between the two times is consistent with the previous calculation with pandas.
Concluding Remarks
We’ve touched on what Time Zone naive and aware datetime objects are and introduced Delorean as an alternative interface for time manipulation. We’ve also looked at how pandas performs similar calculations and applied both to a real life use case of calculating flight times.
Having used Delorean, I found that although it has some nice features such as .humanize(), it hasn’t been updated since 2018 and can be clunky. Therefore, I would continue to recommend using pandas instead.
Attribution
All gists, notebooks and terminal casts are by the author. All of the artwork is based on assets with CC0 or Public Domain license or SIL OFL and is therefore non-infringing.
The Python logo is used consistent with the Python Software Foundation guidelines.
Theme is inspired by and based on my favourite vim theme: Gruvbox.