ESTIMATING DISTANCE AND TIME USING PYTHON AND GOOGLE MAPS PLATFORM API.

Lovellkaringe
10 min read · Mar 18, 2024

Greetings!

For my first article, we are going to use Python and notebooks to explore a simple use case for Google Cloud’s Distance Matrix API. This API is one of many Google Maps Platform APIs and is used to compute travel time and distance for multiple destinations, based on the transportation mode and other parameters. The API provides information based on the recommended route between start and end points using the available road network; it does not use Euclidean (straight-line) distances.

The objective is to estimate the distance between various locations and related travel time. We will be using the driving mode, but it is possible to use walking, bicycling and transit modes depending on your objectives. This will be accomplished in 3 simple steps:

· Set up a Google Cloud project and a Python notebook environment.

· Establish start points and destinations.

· Code requests to the API and handle responses to extract the time and distance estimates.

Such estimates can be useful in real-world scenarios to feed models that inform decision-making regarding efficiency in distribution of resources, particularly where one has multiple areas of operation.

First,

We need to set up a Google Maps Platform project on Google Cloud to access the API. First-time users can get a free trial of up to 90 days and $300 (as of March 2024) after setting up an account, after which some features will be restricted until you upgrade to a full account. A full guide on how to get started setting up an account and creating a project is available here.

After setting up an account and creating a project, we enable the Distance Matrix API and set up our API keys, which we will use to access the API later. A full guide on how to enable the Distance Matrix API and set up your API keys is available here.

With the API setup complete, we can now set up our Python environment. Anaconda Jupyter notebooks, Google Colab notebooks, or a code editor that can manage notebooks are all viable options. We use Google Colab here because it has the least setup time, is free, and is easy to use and share. To set up your Colab environment, follow the guide here.

Second,

We need some data to work with. We are dealing with locations hence we require spatial data describing the position of the points of interest. This can be in the form of a place ID, an address, or latitude/longitude coordinates. We will be using the latitude/longitude coordinates for our start points and destinations. The coordinates must be separated by a comma without any spaces and begin with the latitude first as prescribed in the documentation.
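As a small illustration of that format, the sketch below builds a "latitude,longitude" string and rejects out-of-range values. The helper name format_latlng is made up for this example and is not part of the googlemaps package.

```python
def format_latlng(lat, lng):
    """Return 'lat,lng' with latitude first and no spaces (hypothetical helper)."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lng <= 180:
        raise ValueError(f"longitude out of range: {lng}")
    return f"{lat},{lng}"

# Example with coordinates shortened from one of the points used later
print(format_latlng(-1.2158, 36.9911))
```

A check like this is cheap and catches the most common slip, which is swapping latitude and longitude.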

Individual coordinates for your places of interest can be obtained by finding the place on Google Maps, right-clicking on its icon, then clicking on its coordinates to copy them. Alternatively, you can get free location data from various open-source data providers online or from a database. For this use case, we will use 3 starting points to represent warehouses, 2 destination points to represent partners and a list of 200 points to represent distribution points, all of which were generated randomly. Here is the link to the list of 200 randomly generated points.

Then,

It's finally time to write some code!

We begin by installing the required packages and importing them into our environment. We will use the googlemaps package to access the Distance Matrix API and the pandas package to manage the dataframe containing our coordinate list.

# INSTALL REQUIRED PACKAGES AND MAKE IMPORTS

# !pip install googlemaps # IF NOT ALREADY INSTALLED

import googlemaps
import pandas

Next, we initialize our Google Maps Platform client using the API key we created in the first step.

# INITIALIZE GOOGLE CLOUD SERVICE

GOOGLE_MAPS_API_KEY = "<INSERT YOUR API KEY HERE>" # No spaces between the quotes and the key

# Initiate Google Maps Platform API
gmaps = googlemaps.Client(key=GOOGLE_MAPS_API_KEY)
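If you plan to share your notebook, you may prefer not to paste the key into a cell at all. The sketch below is one possible alternative, assuming you have stored the key in an environment variable. The variable name GOOGLE_MAPS_API_KEY and the helper load_api_key are our own choices, not something the googlemaps package requires.

```python
import os

def load_api_key(env=None, name="GOOGLE_MAPS_API_KEY"):
    """Read the key from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()  # strip stray spaces around the key
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key

# Usage, assuming the variable was set before the notebook started:
# gmaps = googlemaps.Client(key=load_api_key())
```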

We then declare the variables for our start points (origins), which will represent our warehouses, and endpoints (destinations), which will represent our partners/suppliers. We also load the CSV file(s) containing our list of coordinates, which will represent our distribution points/clients.

# DECLARE STARTPOINT(s) AND DESTINATION(s)

# ---STARTPOINTS---
# Startpoints
startpoint1 = (-1.1374252134825595,36.91097136121056) # SP1
startpoint2 = (-1.1510145675699641,36.642598387166494) # SP2
startpoint3 = (-1.1201086824168112,36.914511961509476) # SP3

# ---DESTINATIONS---
# Declare destinations
endpoint = (-1.215808447681381,36.99104601796648) # EP
endpoint2 = (-1.2434231212273734,36.87341187999162) # EP2

# ---MULTIPLE DESTINATIONS---
# Import csv file with a list of (lat, long) coordinates and create a dataframe from it
# Use ONE of the two options below and keep the other commented out

# 1) ---jupyternotebook---
# df = pandas.read_csv("<Filename>")

# 2) ---Google Colab---
import io # Needed to read the uploaded bytes
from google.colab import files
uploaded = files.upload() # This will prompt a file upload from your drive
df = pandas.read_csv(io.BytesIO(uploaded["<Filename of uploaded file>"]))

df = df.dropna() # Drop empty rows
df.head(5)

Let’s have a glimpse of the coordinates data.

Now, we begin making some time and distance estimates. We request a response from the API, extract the time and distance values from the JSON result, then make the necessary conversions and formatting. The API returns distance in meters and duration in seconds, and in this case we need our distance in kilometers and time in minutes. We choose one warehouse, SP1, and a partner, EP, for this.

# USING GOOGLEMAPS DISTANCE MATRIX API TO ESTIMATE TIME AND DISTANCE FROM A POINT TO A POINT

result = gmaps.distance_matrix(origins = (startpoint1), destinations = (endpoint), mode='driving')

distresult = result["rows"][0]["elements"][0]["distance"]["value"]/1000 # Convert to KM by /1000
timeresult = result["rows"][0]["elements"][0]["duration"]["value"]/60 # Convert to Minutes by /60

print("Distance (KM) SP1 to EP: " , round(distresult,1), '\n', "Time (MINS) SP1 to EP: ", round(timeresult,0))

We get the following output.

But wait, how does this compare to the values from a Google Maps search? We use the same coordinates on Google Maps and voila! The results are reasonably close: about 200 m and 1 minute apart from the best route values. This is good given that the algorithm snaps the coordinates to the nearest road if they are not already on one. This error margin seems acceptable, so we proceed.
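For readers who have not seen the raw response, the sketch below uses a hand-written dictionary in the same shape as a Distance Matrix result and a small helper that performs the same lookups and conversions as our code. The numbers and addresses are made up for illustration and are not real API output, and km_and_minutes is a hypothetical name.

```python
# Hand-written stand-in for an API response; values are illustrative only.
sample_result = {
    "origin_addresses": ["Origin A"],
    "destination_addresses": ["Destination B"],
    "rows": [
        {"elements": [
            {"status": "OK",
             "distance": {"text": "12.3 km", "value": 12300},  # meters
             "duration": {"text": "25 mins", "value": 1500}}   # seconds
        ]}
    ],
    "status": "OK",
}

def km_and_minutes(result, row=0, col=0):
    """Return (km, minutes) for one origin/destination pair, or None if no route."""
    element = result["rows"][row]["elements"][col]
    if element["status"] != "OK":
        return None
    km = round(element["distance"]["value"] / 1000, 1)
    minutes = round(element["duration"]["value"] / 60)
    return km, minutes

print(km_and_minutes(sample_result))  # (12.3, 25)
```

Each row corresponds to one origin and each element within a row to one destination, which is why a single pair sits at row 0, element 0.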

Next, we use multiple start points with a single destination. We order the list of start points in the request and go through the result in the same order. Each row in the response result contains one origin paired with the destination. All our warehouses (SP1, SP2, SP3) are used as start points and partner EP2 is the destination.

# USING GOOGLEMAPS DISTANCE MATRIX API TO ESTIMATE TIME AND DISTANCE FROM MULTIPLE POINTS TO A POINT.

# List the startpoints in the origins variable - (lat,long),(lat,long),(lat,long)...
result = gmaps.distance_matrix(origins = [(startpoint1),(startpoint2),(startpoint3)], destinations=(endpoint2), mode='driving')

dist_SP1_to_EP2 = round(result["rows"][0]["elements"][0]["distance"]["value"]/1000,1)
dist_SP2_to_EP2 = round(result["rows"][1]["elements"][0]["distance"]["value"]/1000,1)
dist_SP3_to_EP2 = round(result["rows"][2]["elements"][0]["distance"]["value"]/1000,1)

time_SP1_to_EP2 = round(result["rows"][0]["elements"][0]["duration"]["value"]/60,0)
time_SP2_to_EP2 = round(result["rows"][1]["elements"][0]["duration"]["value"]/60,0)
time_SP3_to_EP2 = round(result["rows"][2]["elements"][0]["duration"]["value"]/60,0)

print('\n', "Startpoint1 to endpoint2:", '\n', "Distance (KM): " , dist_SP1_to_EP2, ' ', "Time (MINS): ", time_SP1_to_EP2, '\n',
'\n', "Startpoint2 to endpoint2:", '\n', "Distance (KM): " , dist_SP2_to_EP2, ' ', "Time (MINS): ", time_SP2_to_EP2, '\n',
'\n', "Startpoint3 to endpoint2:", '\n', "Distance (KM): " , dist_SP3_to_EP2, ' ', "Time (MINS): ", time_SP3_to_EP2, '\n')

We add another destination to the query, partner EP. Each element within a row contains one destination and the results are ordered in the same way as listed in the query.

# USING GOOGLEMAPS DISTANCE MATRIX API TO ESTIMATE TIME AND DISTANCE FROM MULTIPLE POINTS TO MULTIPLE POINTS.

# List the startpoints in origins and the endpoints in destinations - (lat,long),(lat,long),(lat,long)...
result = gmaps.distance_matrix(origins = [(startpoint1),(startpoint2)], destinations=[(endpoint),(endpoint2)], mode='driving')

dist_SP1_to_EP = round(result["rows"][0]["elements"][0]["distance"]["value"]/1000,1)
dist_SP1_to_EP2 = round(result["rows"][0]["elements"][1]["distance"]["value"]/1000,1)
dist_SP2_to_EP = round(result["rows"][1]["elements"][0]["distance"]["value"]/1000,1)
dist_SP2_to_EP2 = round(result["rows"][1]["elements"][1]["distance"]["value"]/1000,1)

time_SP1_to_EP = round(result["rows"][0]["elements"][0]["duration"]["value"]/60,0)
time_SP1_to_EP2 = round(result["rows"][0]["elements"][1]["duration"]["value"]/60,0)
time_SP2_to_EP = round(result["rows"][1]["elements"][0]["duration"]["value"]/60,0)
time_SP2_to_EP2 = round(result["rows"][1]["elements"][1]["duration"]["value"]/60,0)

print('\n', "Startpoint1 to endpoint:", '\n', "Distance (KM): " , dist_SP1_to_EP, ' ', "Time (MINS): ", time_SP1_to_EP, '\n',
'\n', "Startpoint1 to endpoint2:", '\n', "Distance (KM): " , dist_SP1_to_EP2, ' ', "Time (MINS): ", time_SP1_to_EP2, '\n',
'\n', "Startpoint2 to endpoint:", '\n', "Distance (KM): " , dist_SP2_to_EP, ' ', "Time (MINS): ", time_SP2_to_EP, '\n',
'\n', "Startpoint2 to endpoint2:", '\n', "Distance (KM): " , dist_SP2_to_EP2, ' ', "Time (MINS): ", time_SP2_to_EP2, '\n')
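Writing one variable per pair gets unwieldy as the matrix grows. A possible alternative, sketched below, loops over rows and elements and collects one record per pair. The function name matrix_to_records and the labels are our own, and it stores None for pairs without a route purely for illustration.

```python
def matrix_to_records(result, origin_labels, destination_labels):
    """Flatten a Distance Matrix result into one dict per origin/destination pair."""
    records = []
    # Rows follow the order of the origins, elements the order of the destinations
    for origin, row in zip(origin_labels, result["rows"]):
        for destination, element in zip(destination_labels, row["elements"]):
            ok = element["status"] == "OK"
            records.append({
                "origin": origin,
                "destination": destination,
                "km": round(element["distance"]["value"] / 1000, 1) if ok else None,
                "mins": round(element["duration"]["value"] / 60) if ok else None,
            })
    return records

# Usage with the two-by-two request (labels are our own names for the points):
# records = matrix_to_records(result, ["SP1", "SP2"], ["EP", "EP2"])
# pandas.DataFrame(records)
```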

Now that we have a basic understanding of the query result format, let’s get time and distance estimates for many destinations (200 clients) from a start point, warehouse SP2. We do so by looping through the dataframe containing a list of latitude/longitude coordinates for our ‘clients’ and saving the results back to the dataframe.

# USING GOOGLEMAPS DISTANCE MATRIX API TO ESTIMATE TIME AND DISTANCE FROM A POINT TO A LIST OF MANY POINTS.

# Declare the column holding the coordinates (lat,long)
destinations= df.COORDS # 'COORDS' is the column name containing (lat,long) values

# Declare empty arrays for the distance and time values
DISTANCES = []
TIMES = []

# Iterate through the rows of the dataframe saving the distance and time estimates for each to the coordinate points
for destination in destinations:
    result = gmaps.distance_matrix(origins = (startpoint2), destinations = (destination), mode='driving')
    # Check the status of the result to prevent ZERO_RESULTS or NOT_FOUND from raising key errors
    status = result["rows"][0]["elements"][0]["status"]

    # Save the value if status is okay
    if status == "OK":
        distresult = result["rows"][0]["elements"][0]["distance"]["value"]/1000
        timeresult = result["rows"][0]["elements"][0]["duration"]["value"]/60

        DISTANCES.append(round(distresult,1))
        TIMES.append(round(timeresult,0))
    # if not, set the value to -404 to keep the column numeric and distinguish from same location which has 0 distance and time
    else:
        DISTANCES.append(-404)
        TIMES.append(-404)

# Create new columns in the dataframe to store the distance and time estimate values
df["time_SP2(MINS)"] = TIMES
df["distance_SP2(KM)"] = DISTANCES

# Check first 5 rows
df.head(5)

# EXPORT DATAFRAME AS CSV
df.to_csv('df.csv') # Saves to current working directory for jupyter -- alternatively, add path to desired directory
files.download('df.csv') # ---Google Colab only--- to download the file to your local machine
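The loop above sends one request per destination. Since a single request can carry several destinations, one way to cut the number of calls is to send them in batches. The sketch below assumes the per-request limit of 25 destinations given in the Distance Matrix documentation (worth checking against the current docs). The names batched and distances_in_batches are hypothetical, and the request function is passed in as fetch so that the logic can be tried without network access.

```python
def batched(items, size=25):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def distances_in_batches(fetch, origin, destinations, size=25):
    """Return one (km, mins) tuple per destination, or None where no route was found.

    `fetch` is any callable that takes the same arguments and returns the same
    result shape as gmaps.distance_matrix.
    """
    out = []
    for chunk in batched(list(destinations), size):
        result = fetch(origins=origin, destinations=chunk, mode="driving")
        # One origin means one row; its elements follow the order of the chunk
        for element in result["rows"][0]["elements"]:
            if element["status"] == "OK":
                out.append((round(element["distance"]["value"] / 1000, 1),
                            round(element["duration"]["value"] / 60)))
            else:
                out.append(None)
    return out

# Usage: pairs = distances_in_batches(gmaps.distance_matrix, startpoint2, df.COORDS)
```

With 200 destinations this would make 8 requests instead of 200, while the results stay in the same order as the dataframe rows.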

Finally, we get some estimates from all our start points (warehouses) to the 200 destinations representing our clients. This is like the previous estimation but taking into account the various start points.

# USING GOOGLEMAPS DISTANCE MATRIX API TO ESTIMATE TIME AND DISTANCE FROM MULTIPLE POINTS TO A LIST OF MANY POINTS.

# Declare the column holding the coordinates (lat,long)
destinations= df.COORDS # 'COORDS' is the column name containing (lat,long) values

# Declare empty arrays for the distance and time values with regard to each starting point
DISTANCES_SP1 = []
TIMES_SP1 = []
DISTANCES_SP2 = []
TIMES_SP2 = []
DISTANCES_SP3 = []
TIMES_SP3 = []

# function to append data to empty arrays based on API result
def append_data(result, status, i, DISTANCES, TIMES):
    # Save the value if status is okay
    if status == "OK":
        distresult = result["rows"][i]["elements"][0]["distance"]["value"]/1000
        DISTANCES.append(round(distresult,1))

        timeresult = result["rows"][i]["elements"][0]["duration"]["value"]/60
        TIMES.append(round(timeresult,0))
    # if not, set the value to -404 to keep the column numeric and distinguish from same location which has 0 distance and time
    else:
        DISTANCES.append(-404)
        TIMES.append(-404)

# Iterate through the rows of the dataframe saving the distance and time estimates for each to the coordinate points
for destination in destinations:
    result = gmaps.distance_matrix(origins = [(startpoint1),(startpoint2),(startpoint3)], destinations=(destination), mode='driving')

    # Check the status of the result to prevent ZERO_RESULTS or NOT_FOUND from raising key errors
    status1 = result["rows"][0]["elements"][0]["status"]
    status2 = result["rows"][1]["elements"][0]["status"]
    status3 = result["rows"][2]["elements"][0]["status"]

    # Append distance and time data to empty arrays for each startpoint
    append_data(result, status1, 0, DISTANCES_SP1, TIMES_SP1)
    append_data(result, status2, 1, DISTANCES_SP2, TIMES_SP2)
    append_data(result, status3, 2, DISTANCES_SP3, TIMES_SP3)

# Create new columns in the dataframe to store the distance and time estimate values
df["time_SP1(MINS)"] = TIMES_SP1
df["distance_SP1(KM)"] = DISTANCES_SP1
df["time_SP2(MINS)"] = TIMES_SP2
df["distance_SP2(KM)"] = DISTANCES_SP2
df["time_SP3(MINS)"] = TIMES_SP3
df["distance_SP3(KM)"] = DISTANCES_SP3

# Check first 5 rows
df.head(5)

# EXPORT DATAFRAME AS CSV
df.to_csv('df.csv') # Saves to current working directory for jupyter -- alternatively, add path to desired directory
files.download('df.csv') # ---Google Colab only--- to download the file to your local machine

Just. Like. That!

We have obtained time and distance estimates from three start points to 200 destinations in minutes thanks to the power of Python and the Google Maps Platform API. Here is a link to the complete code repository.
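With one time column per warehouse, a natural first question is which warehouse is quickest for each client. The sketch below is one possible way to answer it with pandas, assuming the column names used above and the -404 marker for failed lookups. The function nearest_start and the tiny demo frame are made up for illustration.

```python
import pandas as pd

SENTINEL = -404  # the "no route" marker used in the loops above

def nearest_start(df, time_cols):
    """Return, per row, the name of the time column with the smallest valid value."""
    times = df[time_cols].astype(float)
    times = times.mask(times == SENTINEL, float("inf"))  # never pick a failed lookup
    best = times.idxmin(axis=1)
    reachable = (times != float("inf")).any(axis=1)
    return best.where(reachable)  # missing value where no start point had a route

# Tiny made-up frame with the same column names as above
demo = pd.DataFrame({
    "time_SP1(MINS)": [10, -404, -404],
    "time_SP2(MINS)": [20, 5, -404],
})
print(nearest_start(demo, ["time_SP1(MINS)", "time_SP2(MINS)"]).tolist())
```

Masking the marker first matters: a plain minimum would otherwise pick -404 as the "fastest" time.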

It is worth noting that several parameters that can tailor the results are not included in the queries we have used. These include avoid (highways, tolls), departure_time, traffic_model (optimistic, pessimistic), language and units, among others. These can be included to get results for specific conditions rather than the best estimates under average traffic conditions which we have obtained. Find the full list of parameters here. Also, the margin of error between the estimates and the values obtained from Google Maps increases the further apart the locations are, but not so much as to render the estimates unusable.
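To give a feel for these parameters, the sketch below collects a few of them into keyword arguments that can be passed to gmaps.distance_matrix. The helper build_options is our own invention; the parameter names and allowed values follow the googlemaps client as we understand it, so treat the checks as assumptions to verify against the documentation.

```python
from datetime import datetime, timedelta

def build_options(avoid=None, traffic_model=None, departure_time=None,
                  units="metric", language=None):
    """Collect optional keyword arguments, leaving out anything not set."""
    if avoid is not None and avoid not in {"tolls", "highways", "ferries", "indoor"}:
        raise ValueError(f"unsupported avoid value: {avoid}")
    if traffic_model is not None:
        if traffic_model not in {"best_guess", "optimistic", "pessimistic"}:
            raise ValueError(f"unsupported traffic_model: {traffic_model}")
        if departure_time is None:
            raise ValueError("traffic_model needs a departure_time")
    options = {"mode": "driving", "avoid": avoid, "traffic_model": traffic_model,
               "departure_time": departure_time, "units": units, "language": language}
    return {name: value for name, value in options.items() if value is not None}

# A pessimistic estimate for a trip leaving in one hour, avoiding tolls
opts = build_options(avoid="tolls", traffic_model="pessimistic",
                     departure_time=datetime.now() + timedelta(hours=1))
print(sorted(opts))

# result = gmaps.distance_matrix(origins=startpoint1, destinations=endpoint, **opts)
# With a departure_time, driving elements may also carry "duration_in_traffic".
```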

All in all, these results are solid for our use case, and we can use them to feed machine learning models to identify the most efficient distribution strategies for our theoretical warehouses. As a parting shot, think of how to estimate time and distance between a long list (100+) of start points and a long list of destination points like the one used in this article.
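As a starting hint for that exercise, one possible approach is to cut the full origin-by-destination matrix into smaller blocks and request each block separately. The sketch below only plans the blocks. It assumes the documented limits of 25 origins or destinations and 100 elements per request (check the current documentation), and plan_blocks is a hypothetical name.

```python
def plan_blocks(n_origins, n_destinations, max_side=25, max_elements=100):
    """Yield (origin_slice, destination_slice) pairs that each respect the limits.

    Uses square blocks with side = floor(sqrt(max_elements)), capped at max_side.
    """
    side = min(max_side, int(max_elements ** 0.5))
    for o in range(0, n_origins, side):
        for d in range(0, n_destinations, side):
            yield (slice(o, min(o + side, n_origins)),
                   slice(d, min(d + side, n_destinations)))

blocks = list(plan_blocks(120, 200))
print(len(blocks))  # 240 blocks of at most 10 x 10 elements

# Each pair can then be requested as
# gmaps.distance_matrix(origins=origins[o], destinations=destinations[d], mode='driving')
# and the rows/elements mapped back using o.start and d.start as offsets.
```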

Thank you for getting this far. Look out for more content on Google Maps Platform APIs!

Lovellkaringe

Data enthusiast and GIS practitioner with deep interest in philosophy and music as well.