# How to measure distances between two coordinates: A data analyst’s guide

In my current job as an Analytics Lead at Grab, I have recently been working on geospatial analysis, and one of the topics I’m working on is computing the distance between coordinates.

In this post, I want to share my understanding of the kinds of distance we can calculate between coordinates, along with the Python code for each. If you find errors in my explanation, please don’t hesitate to correct me. I would love to learn from my mistakes, and they remind me that I still have a long way to go. Let’s learn together!

# 1. Haversine Distance

The Haversine formula calculates the great-circle distance between two points (latitude1, longitude1) and (latitude2, longitude2) on the Earth’s surface, treating the Earth as a sphere. It’s the most common method when accuracy matters over short to medium distances.

**When to use**: Calculating distances for logistics, deliveries, and other location-based services.
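
For reference, the formula can be written out as follows. This is the standard haversine formulation; the symbols are my own notation, with φ for latitude and λ for longitude (both in radians) and R for the Earth’s radius, while a and c match the variables used in the code below.

```latex
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
    + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right)
\qquad
c = 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right)
\qquad
d = R\,c
```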

```python
# import math package
import math

# define the function to calculate haversine distance
def haversine(lat1, lon1, lat2, lon2):
    R = 6371.0  # Earth's radius in kilometers
    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return R * c
```

- `import math` imports the Python `math` module, which provides mathematical functions such as trigonometric and logarithmic calculations.
- The function takes four arguments: `lat1` and `lon1`, the latitude and longitude of the first point, and `lat2` and `lon2`, the latitude and longitude of the second point (all in degrees).
- `R = 6371.0` defines the radius of the Earth as 6371 kilometers. It is needed to convert the angle (in radians) to a distance on the Earth's surface.
- `map(math.radians, [lat1, lon1, lat2, lon2])` applies the `math.radians` function to all the input latitude and longitude values, because trigonometric functions in Python use radians, not degrees.
- `dlat = lat2 - lat1` and `dlon = lon2 - lon1` calculate the differences in latitude and longitude between the two points.
- `a` is the haversine term of the formula, computed from the latitudes and longitudes of the two points. It is an intermediate value, not yet the distance.
- `c` is the central angle between the two points on the sphere, in radians.
- `return R * c` returns the distance in kilometers.

**Example case:** Calculate the Haversine distance between the cities of Jakarta and Singapore. Coordinates for each city are:

- **Jakarta (latitude, longitude)**: (−6.2000, 106.8167)
- **Singapore (latitude, longitude)**: (1.3521, 103.8198)
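
A minimal sketch of that calculation is shown here. It repeats the `haversine` function from above so the snippet runs on its own, and it borrows the variable names from the Vincenty example later in this post.

```python
import math

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    R = 6371.0  # Earth's radius in kilometers
    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return R * c

# Coordinates from the example case above
jakarta_lat, jakarta_lon = -6.2000, 106.8167
singapore_lat, singapore_lon = 1.3521, 103.8198

distance = haversine(jakarta_lat, jakarta_lon, singapore_lat, singapore_lon)
print(f"The Haversine distance between Jakarta and Singapore is {distance:.2f} kilometers.")
```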

So, using the Haversine formula, the distance between Jakarta and Singapore is 903.26 kilometers!

# 2. Vincenty Distance

The Vincenty formula also calculates the distance between two points (latitude1, longitude1) and (latitude2, longitude2) on the Earth’s surface, but it models the Earth as an ellipsoid rather than a perfect sphere. This gives greater precision in distance calculations, at the cost of being more computationally intensive.

**When to use**: Best used for high-precision tasks like aviation, long-range GPS calculations, and satellite data analysis.

```python
import math

def vincenty(lat1, lon1, lat2, lon2):
    # WGS-84 ellipsoid parameters
    a = 6378137.0  # Major radius [meters]
    f = 1 / 298.257223563  # Flattening
    b = (1 - f) * a  # Minor radius

    # Convert lat/lon from degrees to radians
    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])

    # Differences in coordinates
    L = lon2 - lon1
    U1 = math.atan((1 - f) * math.tan(lat1))
    U2 = math.atan((1 - f) * math.tan(lat2))
    sinU1 = math.sin(U1)
    cosU1 = math.cos(U1)
    sinU2 = math.sin(U2)
    cosU2 = math.cos(U2)

    # Iterative process to calculate lambda
    lambda_ = L
    iter_limit = 100
    for i in range(iter_limit):
        sin_lambda = math.sin(lambda_)
        cos_lambda = math.cos(lambda_)
        sin_sigma = math.sqrt((cosU2 * sin_lambda)**2 +
                              (cosU1 * sinU2 - sinU1 * cosU2 * cos_lambda)**2)
        if sin_sigma == 0:
            return 0.0  # Points are coincident
        cos_sigma = sinU1 * sinU2 + cosU1 * cosU2 * cos_lambda
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cosU1 * cosU2 * sin_lambda / sin_sigma
        cos2_alpha = 1 - sin_alpha**2
        # Guard against division by zero for lines along the equator
        if cos2_alpha == 0:
            cos2_sigma_m = 0.0
        else:
            cos2_sigma_m = cos_sigma - 2 * sinU1 * sinU2 / cos2_alpha
        C = f / 16 * cos2_alpha * (4 + f * (4 - 3 * cos2_alpha))
        lambda_prev = lambda_
        lambda_ = L + (1 - C) * f * sin_alpha * (sigma + C * sin_sigma *
                                                 (cos2_sigma_m + C * cos_sigma *
                                                  (-1 + 2 * cos2_sigma_m**2)))
        if abs(lambda_ - lambda_prev) < 1e-12:
            break
    else:
        return None  # formula failed to converge

    u_squared = cos2_alpha * (a**2 - b**2) / b**2
    A = 1 + u_squared / 16384 * (4096 + u_squared * (-768 + u_squared * (320 - 175 * u_squared)))
    B = u_squared / 1024 * (256 + u_squared * (-128 + u_squared * (74 - 47 * u_squared)))
    delta_sigma = B * sin_sigma * (cos2_sigma_m + B / 4 *
                                   (cos_sigma * (-1 + 2 * cos2_sigma_m**2) -
                                    B / 6 * cos2_sigma_m * (-3 + 4 * sin_sigma**2) *
                                    (-3 + 4 * cos2_sigma_m**2)))
    s = b * A * (sigma - delta_sigma)
    return s  # Distance in meters

# Example: Calculate the Vincenty distance between Jakarta and Singapore
jakarta_lat, jakarta_lon = -6.2000, 106.8167
singapore_lat, singapore_lon = 1.3521, 103.8198

distance = vincenty(jakarta_lat, jakarta_lon, singapore_lat, singapore_lon)
print(f"The Vincenty distance between Jakarta and Singapore is {distance / 1000:.2f} kilometers.")
```

Wow, that was a long formula, right? So, here is the explanation… or maybe not. Luckily, we don’t have to write it ourselves: the `geodesic` function from the `geopy.distance` module calculates ellipsoidal distances for us. Note that `geodesic` uses Karney’s algorithm rather than Vincenty’s formula, but both work on the WGS-84 ellipsoid and agree very closely.

As we can see, the result is the same, 899.07 kilometers, whether we use the long-form function or `geodesic` from `geopy.distance`.
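
A minimal sketch of that call, assuming `geopy` is installed (`pip install geopy`). Because it is a third-party package, the import is wrapped so the snippet still runs where it is missing. The helper name `geodesic_km` is my own and is not part of geopy.

```python
# geopy is a third-party package: pip install geopy
try:
    from geopy.distance import geodesic
except ImportError:  # keep the sketch runnable without geopy
    geodesic = None

# geopy expects points as (latitude, longitude) tuples
jakarta = (-6.2000, 106.8167)
singapore = (1.3521, 103.8198)

def geodesic_km(point1, point2):
    """Ellipsoidal (WGS-84) distance in km, or None if geopy is not installed."""
    if geodesic is None:
        return None
    return geodesic(point1, point2).kilometers

distance_km = geodesic_km(jakarta, singapore)
if distance_km is not None:
    print(f"The geodesic distance between Jakarta and Singapore is {distance_km:.2f} kilometers.")
```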

# 3. Euclidean Distance

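I discussed Euclidean distance in a previous post, but let me share it again. Euclidean distance treats the Earth as flat (hello there, flat earthers!), which is why it is only a good approximation over very small areas. The function below scales the longitude difference by the cosine of the mean latitude, applies the Pythagorean theorem, and then converts degrees to kilometers.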

**When to use**: Use in local applications where the area is small, such as the distance between two points in a city or neighborhood.

```python
import math

def euclidean(lat1, lon1, lat2, lon2):
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2 * math.pi / 180)
    y = lat2 - lat1
    return math.sqrt(x**2 + y**2) * 111.32  # Approximation of degrees to kilometers
```

Let’s try to calculate the distance between Jakarta and Singapore using this method.
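
A minimal sketch of that calculation, with the `euclidean` function repeated so the snippet runs on its own. The variable names follow the Vincenty example and are otherwise arbitrary.

```python
import math

def euclidean(lat1, lon1, lat2, lon2):
    """Flat-Earth distance in kilometers between two (lat, lon) points in degrees."""
    # Shrink the longitude difference by the cosine of the mean latitude
    x = (lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = lat2 - lat1
    return math.sqrt(x**2 + y**2) * 111.32  # Approximation of degrees to kilometers

jakarta_lat, jakarta_lon = -6.2000, 106.8167
singapore_lat, singapore_lon = 1.3521, 103.8198

distance = euclidean(jakarta_lat, jakarta_lon, singapore_lat, singapore_lon)
print(f"The Euclidean distance between Jakarta and Singapore is {distance:.2f} kilometers.")
```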

# 4. Manhattan Distance

Manhattan distance is useful when movement is constrained by a grid, such as navigating city blocks or traveling along orthogonal roads. As you can probably tell, it’s not a good fit for the distance between Jakarta and Singapore, but let’s find out!

Anyway, can you guess which color(s) represent the Manhattan distance? You can write your answer in the comment section!

**When to use**: Best for modeling movement through cities where roads and streets are laid out in a grid pattern.

```python
import math

def manhattan(lat1, lon1, lat2, lon2):
    lat_dist = abs(lat2 - lat1) * 111.32  # Convert degrees to kilometers
    lon_dist = abs(lon2 - lon1) * 111.32 * math.cos(lat1 * math.pi / 180)
    return lat_dist + lon_dist
```
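
To find out, here is a minimal sketch that applies the function to the same Jakarta and Singapore coordinates. The function is repeated so the snippet runs on its own, and the variable names follow the Vincenty example.

```python
import math

def manhattan(lat1, lon1, lat2, lon2):
    """Grid distance in kilometers: north-south leg plus east-west leg."""
    lat_dist = abs(lat2 - lat1) * 111.32  # Convert degrees to kilometers
    # East-west leg, scaled by the cosine of the starting latitude
    lon_dist = abs(lon2 - lon1) * 111.32 * math.cos(math.radians(lat1))
    return lat_dist + lon_dist

jakarta_lat, jakarta_lon = -6.2000, 106.8167
singapore_lat, singapore_lon = 1.3521, 103.8198

distance = manhattan(jakarta_lat, jakarta_lon, singapore_lat, singapore_lon)
print(f"The Manhattan distance between Jakarta and Singapore is {distance:.2f} kilometers.")
```

Because the two legs are added rather than combined along the diagonal, the result will always be at least as large as the straight-line figure for the same pair of points.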

# Choosing the Right Distance Formula

In my experience, rather than calculating every kind of distance, it is better to know which one fits a specific need.

- **Haversine**: Balanced accuracy and computational efficiency for most geographic distance calculations *(the most common)*.
- **Vincenty**: Highest accuracy for long-range distances, especially over large areas.
- **Euclidean**: For small-scale calculations where the Earth’s curvature doesn’t significantly affect the result.
- **Manhattan**: Best for grid-like environments, such as urban streets.

But, to know the result comparison of calculating the distance between Jakarta and Singapore using the four methods above, let’s see the table below:
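
One way to reproduce such a comparison is sketched here. It repeats the three short functions from this post, and it takes the Vincenty figure (899.07 km) as reported above instead of recomputing it, to keep the snippet short. The table layout and the "vs Vincenty" column are my own additions.

```python
import math

JAKARTA = (-6.2000, 106.8167)    # (latitude, longitude)
SINGAPORE = (1.3521, 103.8198)
VINCENTY_KM = 899.07  # value reported earlier in this post, not recomputed here

def haversine(lat1, lon1, lat2, lon2):
    R = 6371.0  # Earth's radius in kilometers
    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
    a = (math.sin((lat2 - lat1) / 2)**2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2)**2)
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

def euclidean(lat1, lon1, lat2, lon2):
    x = (lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = lat2 - lat1
    return math.sqrt(x**2 + y**2) * 111.32

def manhattan(lat1, lon1, lat2, lon2):
    lat_dist = abs(lat2 - lat1) * 111.32
    lon_dist = abs(lon2 - lon1) * 111.32 * math.cos(math.radians(lat1))
    return lat_dist + lon_dist

results = {
    "Haversine": haversine(*JAKARTA, *SINGAPORE),
    "Vincenty": VINCENTY_KM,
    "Euclidean": euclidean(*JAKARTA, *SINGAPORE),
    "Manhattan": manhattan(*JAKARTA, *SINGAPORE),
}

# Print each distance and its relative difference from the ellipsoidal figure
print(f"{'Method':<10} {'Distance (km)':>14} {'vs Vincenty':>12}")
for method, km in results.items():
    diff_pct = (km - VINCENTY_KM) / VINCENTY_KM * 100
    print(f"{method:<10} {km:>14.2f} {diff_pct:>+11.2f}%")
```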

# Conclusion

Geospatial analysis is a powerful tool, and understanding how to calculate distances can be a great additional skill for us. By selecting the right distance formula, we can balance the accuracy and efficiency of our distance calculations. I hope this post is useful for all of us. Thank you for reading!