Calculating Distance Between Successive Latitude-Longitude Coordinates using Pandas

Introduction:

Manish Singh
Mar 18, 2024

When working with geographic data, understanding distances between points is often crucial. Whether you’re analyzing travel routes, mapping locations, or studying spatial patterns, calculating distances between latitude-longitude coordinates is a common task. In this blog post, we’ll explore how to achieve this using Python, specifically with Pandas and the Haversine formula.


Understanding the Haversine Formula:

The Haversine formula gives the great-circle distance between two points on a sphere, that is, the shortest distance along the surface, computed from their latitudes and longitudes. It’s particularly useful for computing distances on the Earth’s surface, which is approximately spherical. Because the formula accounts for the curvature of the Earth, it is more accurate than simpler methods that assume a flat Earth, especially over long distances. The Earth is not a perfect sphere, so results can still be off by up to roughly 0.5%.
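
For reference, the formula can be written out as below. The symbols are conventional notation chosen here rather than anything defined earlier in this post: φ₁, φ₂ are the two latitudes and λ₁, λ₂ the two longitudes (all in radians), Δφ = φ₂ − φ₁, Δλ = λ₂ − λ₁, r is the radius of the sphere, and d is the resulting distance.

```latex
a = \sin^2\!\left(\frac{\Delta\varphi}{2}\right)
    + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\Delta\lambda}{2}\right)

d = 2r \arcsin\!\left(\sqrt{a}\right)
```

As a quick check, two points on the equator that are 90° of longitude apart give a = sin²(45°) = 0.5, so d = 2r · arcsin(√0.5) = (π/2) · r, a quarter of the sphere's circumference.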

Requirements:

To follow along with this tutorial, you’ll need:

  1. Python installed on your system.
  2. Basic knowledge of Python programming.
  3. Familiarity with Pandas, a powerful data manipulation library in Python.

Step-by-Step Implementation:

1. Importing Required Libraries:

We’ll start by importing the necessary libraries for our task:

import pandas as pd
import numpy as np

2. Loading Latitude-Longitude Data:

Next, we’ll load our latitude-longitude data into a Pandas DataFrame. For demonstration purposes, we’ll create a simple DataFrame with latitude and longitude columns.

data = {
    'Latitude': [40.7128, 34.0522, 37.7749, 32.7157],
    'Longitude': [-74.0060, -118.2437, -122.4194, -117.1611]
}
df = pd.DataFrame(data)
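
Real-world coordinate data sometimes arrives with swapped or out-of-range values, so a quick range check before computing distances can save some debugging. The sketch below is one possible way to do that, not a required step; the names `valid` and `bad_rows` are made up for illustration.

```python
import pandas as pd

# Same sample data as above.
df = pd.DataFrame({
    "Latitude": [40.7128, 34.0522, 37.7749, 32.7157],
    "Longitude": [-74.0060, -118.2437, -122.4194, -117.1611],
})

# Valid latitudes lie in [-90, 90] and valid longitudes in [-180, 180].
valid = df["Latitude"].between(-90, 90) & df["Longitude"].between(-180, 180)

# Rows that fail the check; empty for the sample data.
bad_rows = df[~valid]
print(f"{len(bad_rows)} row(s) with out-of-range coordinates")
```

A latitude outside [-90, 90] is often a sign that the latitude and longitude columns were swapped.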

3. Calculating Distance using Haversine Formula:

Now, let’s define a function to calculate the distance between two pairs of latitude-longitude coordinates using the Haversine formula.

def haversine(lat1, lon1, lat2, lon2):
    """
    Calculate the great-circle distance in kilometers between two points
    on the Earth (specified in decimal degrees).
    """
    # Convert latitude and longitude from degrees to radians
    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])

    # Haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    c = 2 * np.arcsin(np.sqrt(a))
    r = 6371  # Mean radius of the Earth in kilometers. Use 3958.8 for miles
    return c * r
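
Before applying the function to a whole DataFrame, it can help to test it on cases where the answer is known. The sketch below restates the function so that it runs on its own, then checks a zero-distance case and a quarter of the equator, which should equal (π/2) · r. The variable names are illustrative only.

```python
import numpy as np

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers; inputs in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    return 2 * np.arcsin(np.sqrt(a)) * 6371

# A point is at distance zero from itself.
same_point = haversine(40.7128, -74.0060, 40.7128, -74.0060)

# Two equatorial points 90 degrees apart span a quarter of the circumference.
quarter_equator = haversine(0.0, 0.0, 0.0, 90.0)
expected_quarter = np.pi / 2 * 6371

# New York to Los Angeles, the first two points of the sample data.
nyc_to_la = haversine(40.7128, -74.0060, 34.0522, -118.2437)

print(same_point, quarter_equator, expected_quarter, nyc_to_la)
```

With a radius of 6371 km, the New York to Los Angeles distance should come out at roughly 3,936 km.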

4. Applying the Function to DataFrame:

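Because the function is built from NumPy operations, it accepts whole columns at once. Shifting the latitude and longitude columns down one row with shift() pairs each point with the one before it, so every row gets the distance in kilometers from the previous point. The first row has no previous point, so its distance is NaN.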

df['Distance'] = haversine(df['Latitude'].shift(), df['Longitude'].shift(), df['Latitude'], df['Longitude'])
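
Putting the pieces together, here is a minimal end-to-end sketch that also handles the missing first value and adds a running total. The `Cumulative` column and the `total_km` variable are additions for illustration and not part of the steps above.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers; inputs in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    return 2 * np.arcsin(np.sqrt(a)) * 6371

df = pd.DataFrame({
    "Latitude": [40.7128, 34.0522, 37.7749, 32.7157],
    "Longitude": [-74.0060, -118.2437, -122.4194, -117.1611],
})

# shift() moves each column down one row, so row i sees the point from row i - 1.
df["Distance"] = haversine(
    df["Latitude"].shift(), df["Longitude"].shift(),
    df["Latitude"], df["Longitude"],
)

# The first row has no previous point (NaN); count it as 0 in the running total.
df["Cumulative"] = df["Distance"].fillna(0).cumsum()

# Series.sum() skips NaN by default, so this is the total path length.
total_km = df["Distance"].sum()

print(df.round(1))
print(f"Total path length: {total_km:.1f} km")
```

If the DataFrame holds several independent routes, the shift would need to be done per route (for example with groupby) so that the last point of one route is not paired with the first point of the next.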

Conclusion:

In this tutorial, we’ve learned how to calculate distances between successive latitude-longitude coordinates using the Haversine formula in Python with Pandas. This technique is essential for various applications involving geographic data analysis, such as route optimization, location-based services, and spatial data visualization.
