Spatial Intersects with Geopandas

HP-Nunes
HP-Nunes
Jul 21, 2020 · 5 min read

A tutorial on how to intersect point features inside a polygon boundary in python geopandas

Intersection is one of the most commonplace geospatial analysis tool in GIS (Geographic Information Systems). The simplest intersect method is where various input geometric features (points, polygons, lines) overlap with one another, to derive the overlaid features as the output.

Source: https://pro.arcgis.com/en/pro-app/tool-reference/analysis/intersect.htm

Usually, this can be done simply in a GIS software such as QuantumGIS or Esri ArcGIS. However I am getting a bit jaded switching back-and-forth between programming environments and GIS GUIs, especially when working on relatively simple tasks.

And QGIS is crashing one too many times :(

Therefore, here’s a step-by-step example of how to intersect datapoints features within the boundaries of a polygon. For non-GIS folks out there, think of it as another tool in your wrangling tool-set, if you ever need to filter a dataframe based on a spatial overlap.

Step 1: Observing the spatial distribution of our points of interest

I’ve collected and wrangled together a dataset of measurements from air monitoring sensors for the city of Seattle (Washington) from PurpleAir. Or so I thought: the dataset I downloaded comprises of the full extent of the map displayed, therefore I’ve got datapoints that fall outside of the area of interest.

Although only interested in sensors located within Seattle, I pulled data well beyond the city’s border.

Looking at the city of Seattle’s boundaries on Google Maps (left, below), the datapoints I’m interested in should be mostly spread over a north-south axis, and not be dispersed too much over a west-east axis. This is something that can be roughly observed by plotting the longitude against the latitude on a scatterplot (right, below).

But I hear you: this is too rough of a visual approximation! Wouldn’t it be nice if, for instance, we were to transpose the city’s boundaries over to the scatterplot?

Step 2: Get boundary (polygon) data to intersect our points with

Where does one get data for the boundaries of places? Well, you’re in luck, since that tends to be one of the easiest spatial features to find. A site I’ve been using for many years is overpass-turbo, a nifty little tool to extract mapping features from OpenStreetMap (OSM), the “Wikipedia” of open-source GIS data. Simply navigate to your area of interest on the right-side of the interface, and click on ‘Wizard’ to define your features of interest. Using OSM Wiki, we know that U.S. administrative boundaries for cities are defined as “admin_level = 8”, which will return all administrative boundaries within the extent of the map on display. Again, we just want Seattle, so amend the Wizard query to “admin_level = 8 & name=Seattle”, and voilà:

You can also click directly on the polygon feature, which will show you all the associate tags, or parameters, associated with it.

Click on the ‘Export’ button and select ‘download as GeoJSON’:

Another common source for data pertaining to municipalities are government open-data warehouses such as the City of Seattle Open Data portal, which also provides an option to get city limits as a GeoJSON.

Step 3: Importing a GeoJSON into Pandas

import geopandas as gpdfname = “./qgis/outputs/seattle_B.geojson” # Your filepath herepoly = gpd.read_file(fname)print(type(poly))poly.head()
Note that the dataframe displayed here may look different from yours, since I got the GeoJSON from the city of Seattle’s open-data portal.

As you can see, the GeoJSON was imported as a geoDataFrame object (‘geopandas.geodataframe.GeoDataFrame’), since we used geopandas to import the file.

Step 4: Mapping the city boundaries on the scatterplot

Using matplotlib once more:

import matplotlib.pyplot as pltfig, ax = plt.subplots()ax = sns.scatterplot(x=”Longitude”, y=”Latitude”, data=df) 
# this is plotting the datapoints from the PurpleAir dataframe
poly.plot(ax=ax, facecolor=’none’, edgecolor=’purple’);
# plotting the city's boundaries here, with facecolor = none to
# remove the polygon's fill color
plt.tight_layout();
While most our datapoints fall within the polygon representing Seattle’s city limits, we still need to filter the out-of-bound points from the dataframe.

Neat! Now we can clearly observe where our datapoints are distributed relative to Seattle’s city borders. Therefore, we need to run an intersection so that only the points within the polygon are retained in our dataset.

Step 5: Intersect the datapoints with the city’s polygon

First, we need to convert the datapoint dataframe into a geoDataFrame, since we’ll be using a geopandas function (otherwise this will not work):

# Convert the purpleAir dataframe to a geodataframepoints_gdf = gpd.GeoDataFrame(
df, geometry=gpd.points_from_xy(df.Longitude, df.Latitude))
type(points_gdf) # check here if the conversion worked# Important: you may need to reset the dataframe index if you had an index defined.

This part below brought to you by: read package documentation for smoother sailingᵀᴹ.

import shapely.speedupsshapely.speedups.enable() 
# enable shapely.speedups which makes queries running faster.
pip_mask = points_gdf.within(poly.loc[0, ‘geometry’])
# selecting the polygon's geometry field to filter out points that
# are not overlaid
pip_data = df.loc[pip_mask]

What we did here was to create a “mask” of the points which fell within the polygon, and used it to filter the main dataframe (df) using the .loc() function.

We map once more our datapoints with the city’s polygon:

import matplotlib.pyplot as pltfig, ax = plt.subplots()ax = sns.scatterplot(x=”Longitude”, y=”Latitude”, data=pip_data)poly.plot(ax=ax, facecolor=’none’, edgecolor=’purple’);plt.tight_layout();
Hooray, we retained the intersected datapoints!

Not a bad option if you’re looking to take a break from switching back-and-forth while wrangling your spatial dataset!

Added References:

Importing GeoJSON into pandas: https://ocefpaf.github.io/python4oceanographers/blog/2015/03/30/geo_pandas/

Geopandas documentation on intersecting features: https://automating-gis-processes.github.io/CSC18/lessons/L4/point-in-polygon.html

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data…

Sign up for Analytics Vidhya News Bytes

By Analytics Vidhya

Latest news from Analytics Vidhya on our Hackathons and some of our best articles! Take a look.

By signing up, you will create a Medium account if you don’t already have one. Review our Privacy Policy for more information about our privacy practices.

Check your inbox
Medium sent you an email at to complete your subscription.

HP-Nunes

Written by

HP-Nunes

Aspiring data analyst with a background in GIS. Finishing a Master’s in Environmental Assessment on participatory air monitoring and Citizen Science.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

HP-Nunes

Written by

HP-Nunes

Aspiring data analyst with a background in GIS. Finishing a Master’s in Environmental Assessment on participatory air monitoring and Citizen Science.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store