cuSpatial 23.04

GeoSeries, Intersection, Spatial Predicates, and more!

Alex Ravenel
RAPIDS AI
6 min readMay 23, 2023

--

Authors: Thomson Comer, Mark Harris, Ben Jarmak, Michael Wang

cuSpatial continues barreling towards complete spatial relationship predicate (DE-9IM) support and Cartesian distance between any two geometries. In support of these goals cuSpatial 23.04 includes multiple new features, as well as a new notebook that shows off the capabilities of cuSpatial. There are also some breaking changes in this release to ensure all features in the library meet our high standards.

New Features

  • GeoSeries Standardization
  • Linestring-linestring Intersection
  • Point-Polygon Distance Computation
  • Binary Predicates: Equals and Intersects
  • Stop Sign Counting Notebook

GeoSeries Standardization

Past cuSpatial releases greatly enhanced the capabilities of cuSpatial’s geometry series object: GeoSeries. In this release, cuSpatial standardizes all API inputs with GeoSeries. Instead of passing custom offset and coordinate arrays like before, all cuSpatial APIs now accept GeoSeries as input. Constructing a cuSpatial GeoSeries from GeoPandas is as simple as:

import geopandas as gpd
gpds1 = gpd.GeoSeries([Point(0, 0), Point(1, 1)])
cus1 = cuspatial.GeoSeries(gpds)

You can also directly construct a GeoSeries from a list of shapely objects:

cus2 = cuspatial.GeoSeries([Point(2, 2), Point(3, 3)])

Using GeoSeries everywhere significantly simplifies cuSpatial API calls.

# cuspatial 23.02
old = cuspatial.haversine_distance(p1_lon, p1_lat, p2_lon, p2_lat)
# cuspatial 23.04
new = cuspatial.haversine_distance(p1, p2)

Interleaved x-y coordinates v.s. separate x-y coordinate column

GeoSeries assumes that its underlying coordinates are stored in a single interleaved x-y column. For data stored in separate x-y columns, cuSpatial provides the following new factory methods to construct a GeoSeries.

GeoSeries.from_points_xy
GeoSeries.from_multipoints_xy
GeoSeries.from_linestrings_xy
GeoSeries.from_polygons_xy

To construct a GeoSeries from separate coordinate columns, first call interleave_columns() from cuDF to construct an interleaved coordinate column, then construct the geoseries with from_points_xy.

coords = cudf.DataFrame({“x”: xcol, “y”: ycol}).interleave_columns()
points = GeoSeries.from_points_xy(coords)

Updating the API to accept GeoSeries input is a major breaking change. It’s a necessary change since there are many GIS data formats out there, while libcuspatial computation kernels have strong assumptions on inputs. By separating the data structure construction and computation, this design makes cuSpatial computation more efficient and robust.

Linestring-Linestring Intersection

Set operations allow users to identify and analyze overlapping spatial features, which is critical for tasks like route optimization, resource allocation, and conflict resolution in infrastructure planning. cuSpatial 23.04 introduces an API that computes the intersecting geometries between two columns of linestrings. The API can be called via cuspatial.pairwise_linestring_intersection. The following example shows how to use it.

Image 1: Example of two linestring intersects on a point and a segment.
Image 1: Example of two linestring intersects on a point and a segment.

Given two linestrings (ABCDE, FGHI), compute the intersections between them:

>>> from shapely.geometry import *
>>> from cuspatial.core.binops.intersection import pairwise_linestring_intersection
>>> l1 = cuspatial.GeoSeries([
… LineString([(0, 0), (2, 1), (4, 3), (2, 5), (0, 3)])])
>>> l2 = cuspatial.GeoSeries([
… LineString([(-1, 6), (2, 3), (3, 4), (1, 6)])])
>>> offsets, geoms, lookbacks = pairwise_linestring_intersection(l1, l2)
>>> offsets
<cudf.core.column.numerical.NumericalColumn object at 0x7f097c987bc0>
[
0,
2
]
dtype: int32
>>> geoms
0 POINT (1.00000 4.00000)
1 LINESTRING (2.00000 5.00000, 3.00000 4.00000)
dtype: geometry
>>> lookbacks
lhs_linestring_id lhs_segment_id rhs_linestring_id rhs_segment_id
0 [0, 0] [3, 2] [0, 0] [0, 2]

The API returns a 3-tuple:

  • offsets, a cuDF integer series, which includes the offsets to the first result geometries for each pair;
  • geoms, a GeoSeries that contains the intersection results (points and linestrings);
  • lookbacks, a 4-column dataframe that contains the source geometry IDs from which to search the result.

In this example, the offsets result of [0, 2] indicates that the intersection for the 0th pair starts at index 0 of the geoseries, and it has two results.

The geoseries (geoms), shows the result contains two values:

  1. A point with coordinates (1, 4).
  2. This is the point of intersection between the linestrings (D, E) and (F, G) in the image above.
  3. A linestring with coordinates (2, 5)->(3, 4).
  4. This is the intersecting segment (D, H) in the image above.

The 0th element in the 0th row of the lookback ids (lookbacks) indicates the position of the first result geometry in the first pair. From the example above, the first value — which corresponds with the first geometry in geoms — for each element in lookbacks is (0, 3, 0, 0).

(0, 3, 0, 0) means that POINT (1, 4) (the first (0) geometry returned in geoms) comes from the first (0) linestring (ABCDE), fourth (3) segment (2, 5) -> (0, 3) on the left hand side and first (0) linestring (FGHI), first (0) segment (-1, 6)->(2, 3) on the right hand side.

There is no multilinestring in the input, therefore all linestring lookback ids are 0.

We recognize that this adds complexity, but it’s because intersection is a powerful primitive. In a sequential API that takes a single pair of geometries for each intersection test, the calling code knows immediately which input geometries intersect. A parallel API that operates on arrays of data must provide details of which geometries intersect so that the caller can act on the results. We can’t wait to see what you create with pairwise_linestring_intersection!

Point-Polygon Distance Computation

Great strides were made to cuSpatial’s suite of Cartesian Distance computations (known as ST_Distance in PostGIS/MySQL/SpatiaLite parlance) in this release with the addition of point-polygon distance computation. To use this new feature, call the cuspatial.pairwise_point_polygon_distance API.

This example shows two pairs of inputs: Point E and Polygon ABC, and MultiPoint G, F and Polygon BCD. Use the API to compute the distances as in the following code.

>>> import cuspatial
>>> from shapely.geometry import *
>>> lhs = cuspatial.GeoSeries([MultiPoint([(0, 2)]), MultiPoint([(2 ,1), (2, 3)])])
>>> rhs = cuspatial.GeoSeries([Polygon([(0, 0), (2, 0), (1, 2), (0, 0)]),
… Polygon([(2, 0), (3, 2), (1, 2), (2, 0)])])
>>> cuspatial.pairwise_point_polygon_distance(lhs, rhs)
0 0.894427
1 0.000000
dtype: float64

Note that the second pair in the series contains a multipoint, where one of the points intersects with the polygon in the corresponding pair, leading to a distance of 0.

Binary Predicates: Equals and Intersects

cuSpatial 23.04 expands support for topological binary predicates by adding those that depend on point-to-point equality and/or intersection. This capability enables a large set of predicates for various geometry combinations.

points.geom_equals(points)
linestrings.geom_equals(linestrings)
polygons.geom_equals(linestrings)
multipoints.geom_equals(multipoints)

points.contains(points)
points.covers(points)
points.intersects(points)
points.within(points)
points.crosses(points)
points.overlaps(points)
points.intersects(points)

linestrings.intersects(points)
linestrings.intersects(multipoints)
linestrings.intersects(linestrings)

points.disjoint(linestring)
linestrings.disjoint(points)
linestrings.disjoint(linestrings)

linestrings.contains(points)
linestrings.covers(points
linestrings.crosses(points)
linestrings.crosses(linestrings)
linestrings.crosses(polygons)
linestrings.overlaps(points)
linestrings.overlaps(linestrings)
linestrings.overlaps(polygons)

The next release of cuSpatial aims for complete spatial predicate support.

Stop Sign Counting Notebook

cuSpatial adds a new notebook demonstrating spatial join capability with the point-in-polygon function. This notebook includes a custom QuadTree data structure to allow fast spatial joins on the GPU in just a few lines of code, for example:

# Build a quadtree with all stop signs in the US
stop_quadtree = QuadTree(d_stops, x_column=’x’, y_column=’y’)

# Pass CA zip code boundary polygons
stop_quadtree.set_polygon(CA_zipcode, poly_column=”WKT”)

# Join the stop signs and the zip code dataframe
stop_by_zipcode = stop_quadtree.point_left_join_polygon([“x”, “y”], [“ZCTA5CE10”])

This filters all stop signs that exist in every zip code in California. Finally, the notebook visualizes the count of the zipcodes on a map. Check out the notebook to see and interact with the map!

Other Improvements and Bug Fixes

In addition to the major features above, cuSpatial 23.04 includes a number of other improvements and bug fixes. The refactoring of cuSpatial’s C++ implementation is complete after about a year of effort, with a followup in 23.06 to improve header and source reorganization. Libcuspatial now comprises two layers: the generic header-only API where algorithms are implemented, and the cuDF column-based API which enables interoperability with libcudf.

The GDAL dependency has been removed, making the installation lighter weight, and paving the way to PIP installation support.

Documentation continues to improve, with updates and corrections to multiple sections of the User Guide.

Breaking Changes

In addition to changes to APIs to support GeoSeries, there are two significant API removals. The cubic spline interpolation function was not widely used and uncovered bugs in dependencies, limiting usability. This API also did not mesh well with the cuSpatial feature set, so it has been removed. The very limited shapefile reader has also been removed because of its heavy dependency on GDAL, limited parallelism, and limited feature set (only polygons and no other geometry types could be loaded). We plan to begin adding more capable and higher performance geospatial data I/O in a near future release.

Try cuSpatial 23.04 Today

Experience the new features and high performance of cuSpatial 23.04 and take your spatial analysis to new places. With the latest release, we have introduced several exciting features and enhancements that will revolutionize your spatial relationship computations. Try out the new features and new notebooks. Join our growing community of developers and researchers leveraging cuSpatial’s capabilities for efficient and robust spatial analytics.

cuSpatial User Guide

cuSpatial Discussions Forum

cuSpatial Github

--

--