Using Python to Batch Convert CSV to Google Street View Images

Greg
7 min read · Jan 5, 2023

Test data used for this project was downloaded from Kaggle here.

The GitHub repository with all the code and examples is located here.

Have you ever wanted to download a set of Google Street View images for a specific set of addresses? In this tutorial, we will walk through the process of creating a Python script that does just that.

Prerequisites

Before we begin, make sure you have the following tools installed on your machine:

  • Python 3
  • pip (Python package manager)

We will also be using the following extra Python library in this tutorial:

  • tqdm, for tracking the progress of our multiprocessing workload

Use the following command to install this library:

pip3 install tqdm

Directory setup:

We are going to set up a folder structure ahead of time to keep things cleaner as we go along. In the directory you will be working in, create a data folder and an output folder.

  • data will hold the CSV file (address.csv) with all the addresses we will be converting into StreetView images.
  • output will hold all the images downloaded from StreetView, in JPG format.

Next, go one level above the working directory (from /documents/workingdirectory/ to /documents/) and create an empty text file named k.txt. It will hold the API key we will get in a moment.
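If you would rather script this layout than create it by hand, here is a minimal sketch using pathlib. The function name create_layout and the example path are my own placeholders, not part of the original project.

```python
from pathlib import Path


def create_layout(working_dir):
    """Create data/ and output/ inside working_dir, plus an empty k.txt one level up."""
    working_dir = Path(working_dir)
    for name in ("data", "output"):
        # exist_ok=True makes the call safe to repeat
        (working_dir / name).mkdir(parents=True, exist_ok=True)
    key_file = working_dir.parent / "k.txt"
    key_file.touch(exist_ok=True)  # creates the file if missing, keeps existing content
    return key_file
```

Calling create_layout("documents/workingdirectory") would produce the same structure described above, relative to wherever you run it.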

Google StreetView API key setup:

Quick note: This API is a paid service with a free tier, so check Google's fees for this API to see whether it works for you.

I won't go in depth on the API key setup, as there are plenty of other guides out there and I want to focus on the Python script in this write-up.

Google has a good guide to get you started with the API key here.

The Google StreetView API pricing tiers are explained here.

Once you have the API key from Google, copy it into the k.txt file created earlier. Make sure there are no spaces or anything else in that text file, just the key from Google.

Grabbing test data to work with:

We will be using a CSV file with addresses from Kaggle.

  • Proceed to the following Kaggle link.
  • Click the download button and save the CSV file to your data folder (you may need to create a Kaggle account, which is free).
  • Once downloaded, convert the file to plain-text CSV format if needed.
  • Rename the file to address.csv.

Python

Now we will finally start digging into Python and write our code.

Step 1: Importing libraries

We will begin by creating our Python script. You can call it anything you want; we are only making one Python file. I named mine map_no_gui.py because I am working on a separate version of this project that implements a GUI.

touch map_no_gui.py

Next, open your new Python script in your favorite editor.

Now let's import our libraries:

import csv  # used to read the CSV source file
import os  # helps us work with file paths and delete files
import urllib.parse
import urllib.request  # these two help us send requests to and receive data from the StreetView API
import glob  # used to clear out old output files when needed
import time  # used to measure how long the script takes under different settings
from multiprocessing import Pool  # enables multiprocessing, so we can run multiple StreetView API requests at the same time
from tqdm import tqdm  # provides progress information for a multiprocessing workload
from os.path import exists  # used to verify that the necessary files are available before the script runs

Step 2: Verifying local files

The first thing we want to do is check that the files we need exist. If they don't, we display an error and quit() the script.

if not exists('../k.txt') or not exists('data/address.csv'):
    print('Files missing! Verify key file and input data locations.')
    quit()
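If you want the error to say which file is missing, a small variation could look like the following sketch. The helper name find_missing is hypothetical; the paths are the ones used in this tutorial.

```python
from os.path import exists


def find_missing(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not exists(p)]


required = ['../k.txt', 'data/address.csv']
missing = find_missing(required)
if missing:
    # In the real script you would follow this message with quit()
    print('Files missing: ' + ', '.join(missing))
```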

If the code continues and does not quit(), everything is set up correctly and we can read the key from the k.txt file we created earlier and build the key parameter for our requests.

with open('../k.txt') as key_file:
    key_txt = key_file.readline().strip('\n')

key = "&key=" + key_txt
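Because a stray space or blank first line in k.txt would end up in every request URL, it can help to validate the key as it is read. The sketch below is one way to do that; load_key is a name I made up for illustration.

```python
def load_key(path):
    """Read the first line of the key file and strip surrounding whitespace."""
    with open(path) as key_file:
        key_txt = key_file.readline().strip()
    # Reject an empty first line or a key with whitespace inside it
    if not key_txt or any(ch.isspace() for ch in key_txt):
        raise ValueError('Key file should contain only the API key on its first line.')
    return key_txt
```

With this helper, the last line above would become key = "&key=" + load_key('../k.txt').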

Step 3: Processing the CSV file

Now let's import the CSV file into a list. Each address is formatted to work with the API request we will be sending later. The pop(0) at the end removes the header row, which was also imported but which we won't be using.

full_add_list = []

with open('data/address.csv', mode='r') as address_csv:
    reader = csv.reader(address_csv)
    for row in reader:
        full_add_list.append(row[0] + ', ' + row[1] + ', ' + row[2] + ' ' + row[3])

full_add_list.pop(0)
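An alternative to pop(0) is to skip the header row with next() before the loop starts. The sketch below does that and also ignores rows with fewer than four columns; read_addresses is a hypothetical helper, and it assumes only what the code above assumes, namely that the address is built from the first four columns.

```python
import csv


def read_addresses(csv_path):
    """Build one address string per data row, skipping the header row."""
    addresses = []
    with open(csv_path, mode='r', newline='') as address_csv:
        reader = csv.reader(address_csv)
        next(reader, None)  # skip the header; the default avoids an error on an empty file
        for row in reader:
            if len(row) < 4:
                continue  # ignore blank or incomplete rows
            addresses.append(row[0] + ', ' + row[1] + ', ' + row[2] + ' ' + row[3])
    return addresses
```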

Step 4: User input

Now that everything is set up, it's time to find out what the user wants the script to do. Each prompt sits in a loop that only exits once the user enters a valid option. The first prompt asks how fast the program should run. Although fast completes the task sooner, sending too many requests at once can get you throttled by Google, push you out of the free tier into the paid portion of the API, or both, so be cautious. In the next step we will map fast and slow to the number of processes run simultaneously.

The following code also asks whether the user wants to delete all files in the output directory, to prevent confusion and clutter. At the end, we call pool_handler(), a function we will create in the next step that handles the multiprocessing.

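# The guard keeps worker processes from re-running the prompts when
# multiprocessing re-imports this file (the default on Windows and macOS).
if __name__ == '__main__':
    speed_input = 'a'
    while speed_input != 'fast' and speed_input != 'slow':
        speed_input = input('Input program speed (fast) or (slow): ')
    output_clean = 'b'
    while output_clean != 'y' and output_clean != 'n':
        output_clean = input('Delete all files in output directory (y) or (n)? ')

    pool_handler()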
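The two prompt loops share the same shape, so they could be folded into one helper. This is only a sketch; ask_choice and its input_fn parameter (there to make the helper testable without a keyboard) are my own additions.

```python
def ask_choice(prompt, choices, input_fn=input):
    """Keep prompting until the user types one of the allowed choices."""
    answer = None
    while answer not in choices:
        # Normalizing case and whitespace accepts inputs like ' FAST '
        answer = input_fn(prompt).strip().lower()
    return answer
```

The speed prompt would then read speed_input = ask_choice('Input program speed (fast) or (slow): ', ('fast', 'slow')).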

Step 5: Defining multiprocessing function

Next we will define a couple of functions. These functions should be placed after the full_add_list.pop(0) line. First up is the pool_handler() function, which manages our multiprocessing.

The first thing we need to do is use the user inputs to determine our speed and whether to clear the output directory.

We use 32 processes for fast and 2 for slow. 32 was the sweet spot I found before the API started slowing me down; your mileage may vary, so experiment with these values if you think you can run more than 32 at once.

If the user wants to delete all files in the output directory, we use glob to handle that aspect.

def pool_handler():
    if speed_input == 'fast':
        speed = 32
    else:
        speed = 2
    if output_clean == 'y':
        files = glob.glob('output/*')
        for f in files:
            os.remove(f)

Next, still inside pool_handler(), we will set up the multiprocessing and the tqdm progress tracking. When the script reaches this point, it starts the multiprocessing using the get_street() function we will write in the next step. We also record time.time() before and after to print how many seconds the multiprocessing took.

    p = Pool(speed)
    start_time = time.time()
    rez = tqdm(p.imap(get_street, full_add_list), total=len(full_add_list))
    tuple(rez)  # consuming the iterator waits for every task to finish
    end_time = time.time()
    print(end_time - start_time)

The pool_handler() function should now look like this:

def pool_handler():
    if speed_input == 'fast':
        speed = 32
    else:
        speed = 2
    if output_clean == 'y':
        files = glob.glob('output/*')
        for f in files:
            os.remove(f)
    p = Pool(speed)
    start_time = time.time()
    rez = tqdm(p.imap(get_street, full_add_list), total=len(full_add_list))
    tuple(rez)  # consuming the iterator waits for every task to finish
    end_time = time.time()
    print(end_time - start_time)
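Since each task spends most of its time waiting on the network, a thread pool is a possible alternative to a process pool. The sketch below swaps in concurrent.futures.ThreadPoolExecutor from the standard library, which is not what this tutorial uses. The names run_all and fake_download are hypothetical, and the progress bar is left out to keep the example dependency-free.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_all(worker, items, max_workers=2):
    """Apply worker to every item using a pool of threads; return results and elapsed seconds."""
    start_time = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(worker, items))  # map preserves input order
    return results, time.perf_counter() - start_time


def fake_download(address):
    """Stand-in for get_street(); it only pretends to do network work."""
    time.sleep(0.01)
    return address.upper()


results, elapsed = run_all(fake_download, ['a st', 'b ave'], max_workers=2)
print(results, round(elapsed, 2))
```

The with block shuts the pool down once all tasks are done, and time.perf_counter() is a clock intended for measuring short durations.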

Step 6: Define function for StreetView API

Alright, we are almost done. We just need to define this last function and make sure everything is in the right place.

The get_street() function takes one address from full_add_list, sends it to the Google API along with your key, and requests the image. The image is saved in the output directory created at the beginning, named after the address with a .jpg extension.

def get_street(address):
    save = 'output/'
    base = "https://maps.googleapis.com/maps/api/streetview?size=1200x800&location="
    MyUrl = base + urllib.parse.quote_plus(address) + key  # URL-encode the address
    fi = address + ".jpg"
    urllib.request.urlretrieve(MyUrl, os.path.join(save, fi))
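Two details of get_street() are worth isolating: how the request URL is assembled, and what happens when an address contains a character such as / that cannot appear in a file name. The sketch below builds the query string with urllib.parse.urlencode and applies a simple replacement rule to file names. Both helper names and the replacement rule are my own assumptions, 'DEMO_KEY' is a placeholder, and build_url takes the raw key rather than the "&key=" string built earlier.

```python
import re
import urllib.parse

BASE_URL = "https://maps.googleapis.com/maps/api/streetview"


def build_url(address, key_txt, size="1200x800"):
    """Assemble a StreetView request URL; urlencode escapes each value."""
    query = urllib.parse.urlencode({"size": size, "location": address, "key": key_txt})
    return BASE_URL + "?" + query


def safe_filename(address):
    """Replace path separators and other risky characters with underscores."""
    return re.sub(r'[\\/:*?"<>|]', "_", address) + ".jpg"


print(build_url("1 Main St, Springfield, IL 62701", "DEMO_KEY"))
print(safe_filename("1/2 Main St, Springfield, IL 62701"))
```

Neither helper touches the network, so you can check the generated URL and file name before spending any API requests.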

Great! Now your code should look like this:


import csv
import os
import urllib.parse
import urllib.request
import glob
import time
from multiprocessing import Pool
from tqdm import tqdm
from os.path import exists




if not exists('../k.txt') or not exists('data/address.csv'):
    print('Files missing! Verify key file and input data locations.')
    quit()

with open('../k.txt') as key_file:
    key_txt = key_file.readline().strip('\n')

key = "&key=" + key_txt
full_add_list = []

with open('data/address.csv', mode='r') as address_csv:
    reader = csv.reader(address_csv)
    for row in reader:
        full_add_list.append(row[0] + ', ' + row[1] + ', ' + row[2] + ' ' + row[3])

full_add_list.pop(0)

def get_street(address):
    save = 'output/'
    base = "https://maps.googleapis.com/maps/api/streetview?size=1200x800&location="
    MyUrl = base + urllib.parse.quote_plus(address) + key  # URL-encode the address
    fi = address + ".jpg"
    urllib.request.urlretrieve(MyUrl, os.path.join(save, fi))


def pool_handler():
    if speed_input == 'fast':
        speed = 32
    else:
        speed = 2
    if output_clean == 'y':
        files = glob.glob('output/*')
        for f in files:
            os.remove(f)
    p = Pool(speed)
    start_time = time.time()
    rez = tqdm(p.imap(get_street, full_add_list), total=len(full_add_list))
    tuple(rez)  # consuming the iterator waits for every task to finish
    end_time = time.time()
    print(end_time - start_time)


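# The guard keeps worker processes from re-running the prompts when
# multiprocessing re-imports this file (the default on Windows and macOS).
if __name__ == '__main__':
    speed_input = 'a'
    while speed_input != 'fast' and speed_input != 'slow':
        speed_input = input('Input program speed (fast) or (slow): ')
    output_clean = 'b'
    while output_clean != 'y' and output_clean != 'n':
        output_clean = input('Delete all files in output directory (y) or (n)? ')

    pool_handler()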

If you did everything correctly, you should have taken a CSV file of addresses and converted it into JPG StreetView images of those addresses.

FINAL NOTE:

Make sure to keep an eye on your use of the API, especially when using multiprocessing alongside it like this.
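One cheap way to reduce API usage is to avoid requesting images you already have. The sketch below filters the address list against the output directory before any requests are made. The helper pending_addresses is hypothetical and assumes files are named as in get_street(), that is, the address followed by .jpg; it only helps when you answer n to the delete prompt.

```python
import os


def pending_addresses(addresses, output_dir='output'):
    """Return only the addresses that do not yet have a saved image."""
    return [
        address for address in addresses
        if not os.path.exists(os.path.join(output_dir, address + '.jpg'))
    ]
```

Passing pending_addresses(full_add_list) to the pool instead of full_add_list would skip anything downloaded on a previous run.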
