Creating choropleth maps in Altair

A Gordon
Published in DataExplorations
Dec 10, 2018

In an earlier post, I discussed how to generate choropleth maps in Folium and Tableau Public. Today I want to briefly walk through how to create them in Altair. There are a couple of examples on the Altair website, but I found them a bit hard to abstract out to the more typical situation where you have a shape file and a dataframe to plot. Happily (isn’t the internet great?!), I found this article by Mark Sussman, which was a big help in figuring out how to approach this.

Note: you will need GeoPandas installed for this approach and, fair warning, it can be a bit tricky to install. I’m not certain of this, but my experience so far suggests that it may help to install GeoPandas first and then Altair.

Create GeoPandas Dataframe

The first step is to create a GeoPandas dataframe of the neighbourhood boundaries. The following code allows you to choose whether to use a shape file or a GeoJSON file, depending on what you have available:

import json

import geopandas as gpd

geo_json_file_loc = '../data/toronto_crs84.geojson'
shape_file_loc = '../data/Neighbourhoods/NEIGHBORHOODS_WGS84.shp'

def open_geojson():
    # Load the raw GeoJSON file as a Python dict
    with open(geo_json_file_loc) as json_data:
        d = json.load(json_data)
    return d

def get_gpd_df(use_shape_file=True):
    # Build the GeoDataFrame from whichever file type is available
    if use_shape_file:
        gdf = gpd.read_file(shape_file_loc)
    else:
        toronto_json = open_geojson()
        gdf = gpd.GeoDataFrame.from_features(toronto_json)
    return gdf

This creates a GeoPandas dataframe with one row per neighbourhood, including the AREA_NAME column and the geometry column that holds each boundary.

The next step is to merge this with your data frame, assuming that your data has already been mapped to the neighbourhoods. If not, please check out my earlier post on how to map latitude/longitude coordinates to an area in a shape file.

gdf = gdf.merge(df, left_on='AREA_NAME', right_on='neighbourhood', how='inner')
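
Because this is an inner merge, any neighbourhood whose name is spelled differently in the two sources is silently dropped and will be missing from the map. The following is a minimal sketch of a check you could run before merging. The function name and the sample names are made up for illustration; they are not part of the original code.

```python
def unmatched_names(shape_names, data_names):
    """Return the names that appear on only one side of the join."""
    shape_set, data_set = set(shape_names), set(data_names)
    return sorted(shape_set - data_set), sorted(data_set - shape_set)

# Hypothetical sample values, for illustration only
shape_names = ["Annex", "Rosedale", "Riverdale"]
data_names = ["Annex", "Rosedale", "riverdale "]

only_in_shapes, only_in_data = unmatched_names(shape_names, data_names)
print(only_in_shapes)  # names that would lose their data
print(only_in_data)    # names that would lose their shape
```

With the real data you would pass in gdf['AREA_NAME'] and df['neighbourhood']. If both lists come back empty, the inner merge keeps every neighbourhood.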

You then need to convert the GeoPandas dataframe to JSON and extract the features section (this is what Altair will be plotting):

choro_json = json.loads(gdf.to_json())
choro_data = alt.Data(values=choro_json['features'])

In the JSON data, each row becomes a feature, and its column values are nested under a “properties” key. So when referencing a column in the plot, you’ll need to prefix its name with “properties.”. For example, properties.count_buildings contains the number of buildings in each neighbourhood, which is the variable we’re going to use to colour in the map.
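
To make the nesting concrete, here is a small sketch of roughly what one element of choro_json['features'] looks like and why the “properties.” prefix is needed. The feature below is hand-written with made-up values and a toy geometry, and lookup is a hypothetical helper that only mimics how a dotted field name addresses nested data. It is not part of Altair.

```python
# A hand-written stand-in for one element of choro_json['features'].
# The values are made up; the geometry is a toy unit square.
feature = {
    "type": "Feature",
    "properties": {"AREA_NAME": "Example Neighbourhood", "count_buildings": 12},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]],
    },
}

def lookup(record, dotted_field):
    """Resolve a dotted field name such as 'properties.count_buildings'."""
    value = record
    for key in dotted_field.split("."):
        value = value[key]
    return value

print(lookup(feature, "properties.count_buildings"))  # 12
print("count_buildings" in feature)                   # False: not a top-level key
```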

Create Altair Plot

Here is the function for generating the choropleth map in Altair:

  • The Base layer creates the outlines of the neighbourhoods
  • The Choropleth layer colours in the neighbourhoods according to the value stored in the specified ‘color_column’

def gen_map(geodata, color_column, title, tooltip, color_scheme='bluegreen'):
    '''
    Generates Toronto neighbourhoods map with building count choropleth
    '''
    # Add Base Layer
    base = alt.Chart(geodata, title=title).mark_geoshape(
        stroke='black',
        strokeWidth=1
    ).properties(
        width=800,
        height=800
    )
    # Add Choropleth Layer
    choro = alt.Chart(geodata).mark_geoshape(
        fill='lightgray',
        stroke='black'
    ).encode(
        alt.Color(color_column,
                  type='quantitative',
                  scale=alt.Scale(scheme=color_scheme),
                  title="Building Counts"),
        tooltip=tooltip
    )
    return base + choro

To call this function, you pass in the choro_data generated above, specify the feature that will be used to colour in the areas (properties.count_buildings), and specify a title, tooltip and colour scheme:

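# `style` (the house style being mapped) is assumed to be defined earlier
toronto_map = gen_map(
    geodata=choro_data,
    color_column='properties.count_buildings',
    title=f'Location of {style} style houses',
    tooltip=['properties.AREA_NAME:O', 'properties.count_buildings:Q'],
    color_scheme='yelloworangered'
)
toronto_map  # displays the chart in a notebook; inside a function, return it instead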

This results in this plot:

Adding Interactivity

Of course, one of the main benefits of using Altair is the ability to easily add interactivity to your plots. We can accomplish this by:

  • defining a selector
  • tying the color to the selector
  • adding the selection to the choropleth layer

def gen_map_cat2(geodata, color_column, title, tooltip, color_scheme='bluegreen'):
    # Clicking an area selects every area with the same value in color_column.
    # In Altair 5, selection_multi and add_selection are deprecated in favour
    # of selection_point and add_params.
    selection = alt.selection_multi(fields=[color_column])
    color = alt.condition(selection,
                          alt.Color(color_column, type='nominal',
                                    scale=alt.Scale(scheme=color_scheme)),
                          alt.value('lightgray'))
    # Add Base Layer
    base = alt.Chart(geodata, title=title).mark_geoshape(
        stroke='black',
        strokeWidth=1
    ).properties(
        width=600,
        height=600
    )
    # Add Choropleth Layer
    choro = alt.Chart(geodata).mark_geoshape(
        fill='lightgray',
        stroke='black'
    ).encode(
        color=color,
        tooltip=tooltip
    ).add_selection(
        selection
    )

    return base + choro

In the resulting chart, you can click on any neighbourhood, and only it and the other neighbourhoods that share the same predominant style stay coloured, while all other areas are grayed out.
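
If the click behaviour is hard to picture, this plain-Python sketch models it conceptually: every row whose field matches the clicked row keeps its colour, and everything else falls back to light gray. It is only a mental model, not Altair code and not how Vega-Lite implements selections, and it ignores shift-click multi-selection. The function, rows and palette are all hypothetical.

```python
def colours_after_click(rows, field, clicked_index, palette, fallback="lightgray"):
    """Return one colour per row after clicking the row at clicked_index."""
    selected_value = rows[clicked_index][field]
    return [
        palette[row[field]] if row[field] == selected_value else fallback
        for row in rows
    ]

# Hypothetical rows and palette, for illustration only
rows = [
    {"AREA_NAME": "A", "style": "Victorian"},
    {"AREA_NAME": "B", "style": "Bungalow"},
    {"AREA_NAME": "C", "style": "Victorian"},
]
palette = {"Victorian": "orange", "Bungalow": "teal"}

print(colours_after_click(rows, "style", 0, palette))
# ['orange', 'lightgray', 'orange']
```

Here the field argument plays the role of fields=[color_column] in the selection, and fallback plays the role of alt.value('lightgray') in the condition.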

The interactive version of this chart can be found here: https://ag2816.github.io/predominant_styles_by_neighbourhood.html

The source code for this post can be found on my GitHub
