How to load GeoJSON files into BigQuery GIS

An ETL pipeline for GeoJSON to BigQuery

Lak Lakshmanan
Google Cloud - Community
2 min readJul 17, 2019

--

While this tutorial is specifically for GeoJSON files for European administrative areas, I hope you will be able to use it as a guide for other sources of data. Do this in CloudShell or any machine where you have the Google Cloud SDK and Python3 installed.

To query polygon boundaries, load European administrative areas’ GeoJSON into BigQuery

First download the file you are interested in from the Eurostat website:

Then, unzip it to get the GeoJSON file of interest.

The GeoJSON file has a list of geometry features and some properties for each geometry. BigQuery requires new-line delimited JSON files where the geometry column is single string.

Here’s a Python program that will do the necessary conversion (note that the filename corresponds to the EPSG::4326 version — these are the ones with coordinates in longitude/latitude):

Here’s what a single line of the JSON file now looks like:

The resulting schema printed by the program is:

Now, load the new-line separated JSON into BigQuery (change the dataset and table name appropriately):

That’s it! Try out a query to order the polygons in terms of their distance from Frankfurt, Germany:

This returns:

Enjoy!

--

--

Lak Lakshmanan
Google Cloud - Community

articles are personal observations and not investment advice.