TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial…

Python: Use Delorean and Pandas to Calculate Your Next Flight Time

Ashraf Miah
Published in TDS Archive
6 min read · Mar 14, 2021

Photo by Ramon Kagie on Unsplash

Scope

Flight tickets state the departure and arrival dates and times in their respective local Time Zones, which can make working out the length of a flight a bit tricky. In this article, the Delorean library is introduced and, along with pandas, used to manipulate datetime objects with Time Zone information. As an example, the flight time between London Heathrow, UK and Kuala Lumpur, Malaysia is calculated.

Image by the Author | See below for full Attribution details

Introduction

If you’re of a certain age, then the Back to the Future films, which started in the 1980s and featured a futuristic car, were the best time travelling series to date. For everyone else it’s a Tony Stark reference from Avengers: Endgame about time travel. This is a long-winded way of saying that the Delorean library is named after the car from the Back to the Future films and the goofy TV cartoons of the 1990s.

The Delorean library provides an alternative interface for managing Time Zone aware objects, based on dateutil and pytz. In this article I’ll introduce Delorean, create and manipulate datetime objects, and apply that capability to determine flight times from London Heathrow (LHR), United Kingdom to Kuala Lumpur (KUL), Malaysia.

Python Environment

Delorean Installation

The first step is setting up an appropriate environment; the gifcast below shows the process of cloning an environment with conda before installing delorean using pip.

Conda Set-up Gist | by the Author

The key steps are checking and cloning an existing conda environment (ds01). Delorean doesn’t appear to be available via conda-forge, so a pip install is required. Although conda and pip packages shouldn’t really be mixed, they generally coexist if the pip installation is the last step.

Conda Environment Set-up | Cast by Author

Jupyter Notebook Setup

The delorean environment has jupyter, pandas and associated packages pre-installed. To keep the analysis reproducible, the package versions are listed:

Jupyter Notebook Imports | by the Author

Delorean

Introduction

Delorean Intro | by the Author

Initialising a Delorean object with no arguments is the equivalent of generating a datetime object for the current date and time in Coordinated Universal Time (UTC), so two users anywhere in the world running the same command at the same time would get the same result. The second command, Delorean.now(), localises the datetime object to the user’s local Time Zone, which is Europe/London for me.

Time Zone Shifting

Notebook Snippet | by the Author

The shift command is used to change the Time Zone of the utc variable from UTC to Europe/London. However, shift modifies the object in place rather than returning a copy, so the original utc variable changes as well, which may be unexpected. To address this, the utc variable has to be reinitialised:

Notebook Snippet | by the Author

The notebook snippet shows the creation of the utc variable again, but also a separate london variable, which takes a newly initialised Delorean object and shifts it to the London Time Zone. The same goal can be achieved by passing the timezone="Europe/London" parameter to the Delorean object.

The difference between the two variables (london and utc) is in the order of microseconds because London and UTC currently share the same offset. The small difference exists because the two variables were initialised separately; to use a common timebase, the Delorean class accepts a Time Zone naive datetime object:

Notebook Snippet | by the Author

As both variables are now using the same time, but in different Time Zones, the difference between them is of course zero.

Kuala Lumpur, Malaysia

To determine the Time Zone for Kuala Lumpur in Malaysia, the most common Time Zones can be listed using pytz.common_timezones:

Notebook Snippet | by the Author

The list of common Time Zones can be filtered with a list comprehension looking for the keyword “Kuala”, which shows that the appropriate Time Zone is Asia/Kuala_Lumpur. The utc variable is then used to initialise a new variable kuala_lumpur with the Time Zone of the city. As expected, the time difference between London and Kuala Lumpur is zero: the same instant observed at two geographic locations is still the same instant.
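The keyword search can be sketched as follows, assuming pytz is installed (it is a dependency of Delorean). The variable name matches is hypothetical, and the exact contents of the list depend on the pytz release.

```python
import pytz

# Keep only the common Time Zone names that contain the keyword.
matches = [tz for tz in pytz.common_timezones if "Kuala" in tz]
print(matches)
```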

Pandas DateTime

The following notebook shows similar steps as above but using pandas:

Notebook Snippet | by the Author

The current date and time (without Time Zone information) is generated using pd.to_datetime('today'), which is then localised to Europe/London using .tz_localize and stored as pd_london. The date and time in Kuala Lumpur, Malaysia can then be created using the astimezone('Asia/Kuala_Lumpur') function. Note the output of both variables showing the time difference and Time Zones. Unfortunately, unlike Delorean, the pandas version used here does not allow two Time Zone aware variables in different zones to be subtracted directly, hence the error message.

Notebook Snippet | by the Author

To work around the error, the pandas representations can be converted to datetime objects using .to_pydatetime(), which also shows zero time difference. Again, this is as expected: the same instant in two different geographic locations has a time difference of zero.

Calculating the offset between the two Time Zones requires converting the Time Zone aware representations to Time Zone naive ones using .tz_localize(None), which as expected shows an 8 hour difference.
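A minimal sketch of the pandas steps described in this section is given below. It swaps pd.to_datetime('today') for a fixed, illustrative timestamp so that the output is repeatable; the names pd_kl, same_instant and wall_clock_gap are hypothetical. The chosen date falls before UK daylight saving starts, which is why the offset is 8 hours.

```python
import pandas as pd

# Fixed illustrative timestamp in place of pd.to_datetime("today").
pd_london = pd.Timestamp("2021-03-14 12:00").tz_localize("Europe/London")

# The same instant expressed in Kuala Lumpur local time.
pd_kl = pd_london.astimezone("Asia/Kuala_Lumpur")

# Converted to standard datetime objects, the two values describe the
# same instant, so the difference is zero.
same_instant = pd_kl.to_pydatetime() - pd_london.to_pydatetime()

# Dropping the Time Zone information leaves only the local wall-clock
# times, so the difference becomes the offset between the two zones.
wall_clock_gap = pd_kl.tz_localize(None) - pd_london.tz_localize(None)

print(same_instant, wall_clock_gap)
```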

Flight Time Calculations

Photo by Nysa Zainal on Unsplash

Pandas Approach

Having understood the basics of Time Zone manipulations using both Delorean and pandas, we can apply it to a real life application. We’re going to determine the flight time for the last two flights between London Heathrow, UK and Kuala Lumpur, Malaysia. The data is from Flight Radar 24, for Malaysia Airlines flight MH3 / MAS3:

Notebook Snippet | by the Author

The notebook snippet shows the creation of a DataFrame (df) using a dictionary of lists based on the Flight Radar 24 data. It consists of departure and arrival times and Time Zone information.

Notebook Snippet | by the Author

It may be tempting to convert the date and time information into native pandas datetime objects. Note that these are Time Zone naive representations. A simple subtraction of the two times shows a 19 hour and 40 min flight time, which is inconsistent with the actual flight time. The reason is that the time representations do not account for the two differing Time Zones.

Notebook Snippet | by the Author

The code above first localises the departure and arrival times to their respective Time Zones (using .tz_localize). Notice how the dtype of each column now includes the Time Zone. To calculate the flight time, a simple subtraction of the two columns is insufficient, as pandas does not support the operation across two different Time Zones. One of the columns has to be converted (using .dt.tz_convert) to a common Time Zone. The results show the correct flight times of around 11 hours 40 mins. Simple mental arithmetic confirms this: the previous 19 hours and 40 mins, adjusted for the 8 hour difference between the Zones, yields around 11 hours and 40 mins.
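The whole pandas approach can be sketched end to end as below. The departure and arrival values are illustrative placeholders chosen to match the durations quoted in the text; they are not the Flight Radar 24 records, and the column and variable names are assumptions.

```python
import pandas as pd

# Illustrative local times only; NOT the Flight Radar 24 data.
df = pd.DataFrame(
    {
        "departure": ["2021-03-10 21:30", "2021-03-11 21:30"],
        "arrival": ["2021-03-11 17:10", "2021-03-12 17:10"],
    }
)
df["departure"] = pd.to_datetime(df["departure"])
df["arrival"] = pd.to_datetime(df["arrival"])

# Time Zone naive subtraction ignores the 8 hour offset between zones.
naive_duration = df["arrival"] - df["departure"]

# Attach the local Time Zone to each column.
departure_local = df["departure"].dt.tz_localize("Europe/London")
arrival_local = df["arrival"].dt.tz_localize("Asia/Kuala_Lumpur")

# Convert one column to the other's zone before subtracting.
flight_time = arrival_local.dt.tz_convert("Europe/London") - departure_local

print(naive_duration.iloc[0], flight_time.iloc[0])
```

Converting to a common zone before subtracting works regardless of how a given pandas version handles mixed-zone arithmetic.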

Sanity Check With Delorean

As a final check, Time Zone naive departure and arrival datetime objects are created with datetime.datetime.strptime, which uses a string and a format to generate a datetime object. A Delorean object is then initialised from each, with its respective Time Zone. The difference between the two objects is expressed as a timedelta object. A limitation of the timedelta format is that differences are expressed as a combination of days and seconds rather than more appropriate units such as hours:

Notebook Snippet | by the Author

The sanity check shows that the difference between the two times is consistent with the previous calculation with pandas.
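The timedelta limitation can be worked around with a little arithmetic. The sketch below uses only the standard library rather than Delorean, and it makes two assumptions: the times are illustrative placeholders (not the Flight Radar 24 records), and fixed offsets of UTC+0 and UTC+8 stand in for Europe/London and Asia/Kuala_Lumpur, which only holds outside UK daylight saving time.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M"

# Illustrative local times, parsed as Time Zone naive datetime objects.
departure = datetime.strptime("2021-03-10 21:30", FMT)
arrival = datetime.strptime("2021-03-11 17:10", FMT)

# Assumption: fixed offsets replace the named zones for this date.
departure = departure.replace(tzinfo=timezone.utc)
arrival = arrival.replace(tzinfo=timezone(timedelta(hours=8)))

# The subtraction yields a timedelta stored as days and seconds.
flight_time = arrival - departure

# Convert to whole hours and minutes for a friendlier representation.
total_minutes = int(flight_time.total_seconds() // 60)
hours, minutes = divmod(total_minutes, 60)
label = f"{hours} h {minutes:02d} min"
print(flight_time.days, flight_time.seconds, label)
```

Using total_seconds() rather than the seconds attribute keeps the conversion correct for durations longer than a day.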

Concluding Remarks

We’ve touched on what Time Zone naive and aware datetime objects are and introduced Delorean as an alternative interface for time manipulation. We’ve also looked at how pandas performs similar calculations and applied it to a real life use case of calculating flight times.

Although Delorean has some nice features such as .humanize(), it hasn’t been updated since 2018 and can be clunky. Therefore, I would continue to recommend using pandas instead.

Attribution

All gists, notebooks and terminal casts are by the author. All of the artwork is based on assets with a CC0, Public Domain or SIL OFL license and is therefore non-infringing.

The Python logo is used consistent with the Python Software Foundation guidelines.

Theme is inspired by and based on my favourite vim theme: Gruvbox.

Published in TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Written by Ashraf Miah

CTO, Data Scientist & Chartered Engineer (MEng CEng EUR ING MRAeS) with over 20 years experience in the Aerospace, Rail & Energy Industry.