Automate the data processing using python with pandas
Produce a custom map using Google Maps Javascript API — Part 2
This is follow up with the previous work, while last time I mentioned we can simply use Microsoft Excel to generate the data. However, we got around 40 csv files and I don’t want to do the manual work one by one using Excel, so I decided to use python to help generate the necessary data for me to use.
Previous Read
Purpose
I would like to produce the from the UK police data to turn from the CSV to a javascript object for the Google Map.
Example CSV format
Example output format
Estimated time to complete: within 15 minutes
Install python and pandas
I am using Linux for my work, usually python is pre-installed, in case you are using Windows / MacOS, you may go to https://www.python.org/downloads/ to download the python and install.
We use pip to install the pandas
pip install pandas
Import the pandas library and os library to manipulate the file operations
import os
import pandas as pd
Walk through the folder and scan all the input csv
path = '/home/skyau2/ukcrime/2020-06/'
outpath = '/home/skyau2/ukcrime/outdata/'for r, d, f in os.walk(path):
for file in f:
files.append(os.path.join(r, file))
For each csv, we use pandas to read it
df = pd.DataFrame()
for f in files:
df = pd.read_csv(f)
We can have some filter by using pandas, example, I would like to filter the crime type = Violence and sexual offences and remove the data without the location info
df = df[df['Crime type'] == 'Violence and sexual offences']
df = df[df.Longitude.notnull()]
df = df[df.Latitude.notnull()]
Open the file for write, loop through the data in the dataframe and produce the output
fo = open(jsfile, "w")
fo.write('var locations = [')
for row in df.itertuples(index=True):
if row.Index == df.index[-1]:
fo.write('{lat: '+ str(row.Latitude)+ ', lng: '+ str(row.Longitude)+ '}]')
else:
fo.write('{lat: '+ str(row.Latitude)+ ', lng: '+ str(row.Longitude)+ '},')
Complete!
Full Program
Sample output