Mapping a Google Sheet with Lambda and Mapbox

A spreadsheet is still an easy, accessible way to collect and organize information. It’s not a great way to visualize that data, especially if it’s location data. A map is much better, but setting up an online map requires coordinates and JavaScript, not orderly rows and columns of English data. Fortunately, both spreadsheets and online maps have APIs, and it’s simple to connect them with some configuration and a bit of code.

Connecting the dots

March On is collecting, organizing, displaying event data for more than two dozen affiliate chapters and over a hundred events around the world. They maintain this info in Google Sheets, and display it on maps on marchonthepolls.org and www.wearemarchon.org. The glue between the spreadsheet and map is a bit of Python code running in the cloud. Here’s how it works.

  • March On staffers add rows to their Google sheet throughout the day.
  • Twice daily, an AWS Lambda function collects event data from a Google Sheet and merges it with an S3-hosted GeoJSON file.
  • The March On affiliate sheet includes a filename referencing a Google Drive photo. The Lambda function fetches the file, resizes it, and uploads it S3 for delivery to browsers.
  • Mapbox’s Geocoding API translates between human-readable place names and map coordinates.
  • In the browser, Mapbox GL JS draws a beautiful map from a GeoJSON feature collection.

Setting up an AWS Lambda function

There are two main parts to setting up a Lambda function to run these steps regularly: creating a deployment package and configuring the schedule.

For a Python Lambda function, a deployment package is just a zip file containing the script and all of its dependencies. This script uses a package with platform-specific C dependencies, so it was important to build these on the same platform they’d run on. Thanks to How to claw your way out of AWS Lambda function hell using the power of Docker, I found I could do this on my Mac command line.

Next, I set up a CloudWatch Events schedule to run the Lambda twice a day.

For ongoing updates, I wrote a quick script to update the deployment package zip with the latest Python script, then update the function code with aws-cli:

aws lambda update-function-code --function-name marchon-events-geojson --region us-east-1 --zip-file fileb://linux-lambda.zip

Get sheet data

March On’s Google sheet contains data on events, including a location, URL and contact info. Google Drive has an API which loads spreadsheet rows into a Python list with just a couple of lines of code:

result = service.spreadsheets().values().get(
spreadsheetId=os.environ['SHEET_ID'],
range=sheet_range).execute()
values = result.get('values', [])

Geocode locations

Newly added locations have only a human-entered location (ie San Jose, California). The Mapbox Geocoder API translates these names to coordinates. It also returns a relevance score, which is useful for identifying and ignoring poor matches.

response = geocoder.forward(location, limit=1,        
country=countries).geojson()
if 'features' in response and response['features']:
feature = response['features'][0]
if feature['relevance'] < 0.75:
log.warning('WARNING: error geocoding %s’, key)
continue
sheet[key]['geometry'] = response['features'][0]['geometry']

The coordinates returned by the Geocoder map directly to a GeoJSON geometry attribute:

"geometry": {
"type": "Point",
"coordinates": [ -121.8261, 37.2751 ]
}

Once the data from the sheet is merged with the existing GeoJSON, one function call updates the S3-hosted GeoJSON file:

boto3.resource('s3').Object('ragtag-marchon', filename).put(
Body=json.dumps(data, indent=2),
ContentType='application/json',
ACL='public-read',
Expires=(datetime.now() + timedelta(hours=6))))
</pre>

Drawing a map

The Mapbox GL library can load data directly from a GeoJSON URL and use that data to draw a map. But simpler is not always better: due to some behind-the-scenes performance optimizations, data loaded via addSource can’t be accessed by other parts of the app. Instead, the app uses pegasus.js to load the GeoJSON data, then manages the data itself. The app uses the data in several ways: it adds the data as a geojson source, adds that source as a map layer, and uses feature properties as overlay content.

map.on('load', function() {
geojson.then(function(data) {
const affiliate = {
type: 'FeatureCollection',
features: _.filter(features, function(feature) {
return feature.properties.affiliate;
}),
};
map.addSource('marchon-affiliate-geojson', {
type: 'geojson',
data: affiliate,
});
map.addLayer({
id: 'marchon-affiliate-geojson',
type: 'symbol',
source: 'marchon-affiliate-geojson',
layout: {
visibility: 'visible',
'icon-image': 'star-15-red',
'icon-allow-overlap': true,
'text-allow-overlap': true,
},
});
});

The Vue.js library provides a simple framework to enable interactivity, including as highlighting the nearest features to the user’s location and showing map popups on mouseover or click.

Adding photos to the map

March On maintains a Google Drive folder of photos from affiliate chapters. The map displays these as additional context information in a sidebar (desktop) or popup (mobile). The crowd-sourced photos vary widely in size, and can’t be linked to directly from browsers.

To make them available on a map, the Lambda function fetches them from the Drive folder using the Drive API:

files = service.files().list(pageSize=100, q=query).execute()['files']
for photo in file_list:
key = photos.get(photo['name'], None)

dataset[key]['properties']['photoUrl'] = 'https://s3.amazonaws.com/ragtag-marchon/%s' % resize_photo(service, photo)

The Pillow library makes it simple to resize the photos to a consistent small size, and the AWS Boto library requires just one function call to upload the file to S3:

data = service.files().get_media(fileId=file['id']).execute()
img = Image.open(io.BytesIO(data))
resized = img.resize((width, height), Image.ANTIALIAS)
img_bytes = io.BytesIO()
resized.save(img_bytes, format=file_ext.upper())
img_bytes.seek(0)

s3 = boto3.resource('s3')
s3.Object('ragtag-marchon', filename).put(
Body=img_bytes.read(),
ContentType=file['mimeType'],
ACL='public-read',
Expires=(datetime.now() + timedelta(hours=24 * 7))))

Make your own

A bit of cloud-hosted Python and client-side JavaScript makes data contained spreadsheet rows come alive on an interactive map. This map was created as a volunteer project with Ragtag. To make your own, check out the code from Github at: https://github.com/RagtagOpen/marchon-map