Sep 10 · 8 min read

Jianting Zhang, Shuo Wang, Thomson Comer, Josh Patterson, Keith Kraus, Mark Harris, Sujit Biswas

Figure 1: Examples of Spatial Data: trajectories at intersections, LIDAR data, heatmap of shopping behavior in a retail store

The Internet of Things (IOT) has spawned explosive growth in sensor data. Location is some of the most important information generated by sensors, and dynamic location is vital in the case of mobile sensors. Examples include: mobile phones (GPS), vehicles, robots, and cameras.

Applying artificial intelligence (AI) to IOT in this case implies understanding dynamic sensor data in the context of the operating environment and using this understanding to optimize system performance, improve navigation and flow, detect anomalies, and monitor the environment. The critical insights AI-IOT can provide for positioning, navigation, behavior, and environmental understanding impact multiple domains, including smart cities, smart traffic and transportation, warehouse logistics, retail, safety, and security. However, these insights can only be obtained by interpreting dynamic sensor data in the context of the more semi-static environment data typically provided by geographic information systems (GIS) and building information management systems (BIM). We created cuSpatial to accelerate common operations needed in understanding sensor data with GIS information.

What is cuSpatial?

cuSpatial is an efficient C++ library accelerated on GPUs using NVIDIA CUDA and cuDF, the RAPIDS DataFrame library. It includes Python bindings for use by the data science community. cuSpatial provides significant GPU-acceleration to common spatial and spatiotemporal operations such as point-in-polygon tests, distances between trajectories, and trajectory clustering. Speed-ups in comparison with CPU-based implementations range from 10x to 10,000x depending on the operation. Figure 2 shows typical operations in spatial and spatiotemporal data management.

Figure 2: Spatial Data Management.

Table 1 illustrates the layers of cuSpatial, indicating how higher-level applications can leverage cuSpatial functionality.

Table 1: cuSpatial Layers and functionality.

cuSpatial focuses on the bottom four layers of Table 1: geo-representation, geo-operations, indexing and query. The top layers in Table 1 combine cuSpatial functionality with graph functionality to create higher-level spatial and spatiotemporal trajectory analysis.

Figure 3 shows the cuSpatial technology stack, and how it integrates with Python, cuDF, and CUDA.

Figure 3: The cuSpatial technology stack.

cuSpatial C++ and Python APIs operate on the core cuDF data types Series and DataFrame in Python. cuSpatial data is laid out as columns (columnar design, top part of Figure 3) in GPU memory for fast data processing using the GPU’s massive data parallelism and high memory bandwidth.

cuSpatial is designed to work seamlessly with cuDF in end-to-end application workflows. For example, cuDF ETL (Extract-Transform-Load) operations, such as filtering and joining, can be used to clean and preprocess geospatial data based on non-geometric attributes before feeding it into cuSpatial for geospatial operations. Correspondingly, geospatial operations can be used to derive new relational data for further cuDF processing, such as distances and speeds of trajectories that are derived from a large set of unordered and timestamped point locations of objects.

Object locations can be detected using a number of sensors such as video cameras. Intelligent video analytics applied to camera data leads to the sensing of object and event locations within the image frame of reference. This frame of reference need to be converted into the geospatial frame of reference for downstream GIS-based spatial analytics and visualization.

cuDF and cuSpatial together provide access to data in a rich set of formats, including both relational (CSV, Parquet, etc.) and geospatial/GIS, such as shapefiles. More importantly, ingested data can flow from Python data structures on CPUs (including Numpy arrays and data frames) to cuDF/cuSpatial data structures on GPUs in an automatic, transparent and efficient way.

cuSpatial Release v0.10 Functionality

The first release of cuSpatial includes the following functionality.

  1. Columnar data representation for points, lines, polygons
  2. Location data ingestion from sample JSON, csv and binary format
  3. Spatial window query
  4. Point-in-polygon test
  5. Converting latitude and longitude to x, y coordinates.
  6. Haversine distance between two pairs of latitude and longitude points
  7. Pairwise Hausdorff distances between point sets (e.g, trajectories)
  8. Deriving trajectories from sequences of locations and times
  9. Computing trajectory distance and speed
  10. Computing trajectory spatial bounding boxes

It also includes:

  1. Python bindings and tests for all the above features
  2. Sample applications & performance evaluation scripts

cuSpatial Performance

Table 2 illustrates the performance of cuSpatial on the most significant operations implemented.

Table 2: Speedup is seen across all operations and also across different kinds of GPUs.

Clustering Trajectories: A sample Application Using cuSpatial

cuSpatial can be used to speed up many data analytics and machine learning problems that require spatial and spatiotemporal computations. We implemented an example that accelerates a trajectory clustering problem with cuSpatial on a real-world vehicle trajectory dataset. The dataset contains more than 10,000 trajectories extracted through traffic camera video analytics. In this particular use case, cuSpatial reduces end-to-end computation time of the clustering task from over 5 hours to under 15 seconds. The test machine has an Intel Core i7–6700K CPU and NVIDIA Quadro GV100 GPU.

Figure 4: Trajectory clustering results using agglomerative clustering using cuSpatial for directed Hausdorff distance computation.

The trajectory clustering task creates groups of trajectories that are more similar to each other than to trajectories in other groups. Clustering is useful for problems such as motion pattern extraction, behavior analysis and more. A clustering task consists of two main components: a similarity metric and a search algorithm.

Given a specific similarity metric, different clustering algorithms have different searching mechanisms and therefore different complexity. In some cases data scientists may want to precompute all pairwise similarities, for example if they need to perform multiple iterations of hyperparameter search for the clustering algorithm to achieve a good result. It would be inefficient to repeat the time-consuming similarity computation each time.

We use the Hausdorff distance as the similarity metric between trajectories. The Hausdorff distance between two trajectories is the maximum distance of the first trajectory to the nearest point in the second trajectory.

We used trajectories collected from a traffic intersection located in a mid-sized city in the United States. We applied intelligent video analytics techniques to extract real-world vehicle trajectories from multiple traffic cameras. This dataset is part of a city-wide traffic video database which is developed to support various smart city and smart traffic research tasks (see AI City Challenge for more details).

The Python notebook for this code can be found in the RAPIDS notebooks repository on Github.

The data is loaded into a list of Numpy arrays, trajectories. Each Numpy array in this list represents one trajectory with shape (trajectory_length, 2).

print(max([len(traj) for traj in trajectories]))
print(min([len(traj) for traj in trajectories]))
print(np.mean([len(traj) for traj in trajectories]))


<class ‘list’>
<class ‘numpy.ndarray’>
(67, 2)

We create an empty Numpy array dmatrix, and our goal is to fill it with the Hausdorff distance between all trajectory pairs.


We first try to calculate the symmetric Hausdorff distance using


We loop through to fill up dmatrix.

from scipy.spatial.distance import directed_hausdorff

for i in range(dmatrix.shape[0]):
for j in range(dmatrix.shape[1]):
dmatrix[i][j] = directed_hausdorff(trajectories[i], trajectories[j])[0]

While this might be a natural way for a data scientist to solve this problem, it uses a single CPU thread and it takes about 5 hours to finish on an Intel Core i7–6700K CPU. Even though we have chosen to precompute the similarity matrix and only calculate it once, it could be discouraging to wait for hours before starting the actual clustering exploration. Imagine the case where we need to recompute the similarity matrix for each clustering hyperparameter search iteration; speeding up the similarity computation brings huge benefits because it makes this analytical task interactive.

We can accelerate this using


This cuSpatial function takes as arguments three cuDF Series objects. The first two contain the x- and y-coordinates for all points in the trajectory dataset, and the third contains the length of each trajectory.

It returns the pairwise symmetric hausdorff distance in another cuDF Series object.

Here we need to first convert the trajectories into the right format:

import cudf 
cnt = []
for traj in trajectories:
cnt = np.asarray(cnt)
trajs = np.concatenate([np.asarray(traj) for traj in trajectories], 0)
x_s = trajs[:,0]
y_s = trajs[:,1]
pnt_x = cudf.Series(x_s)
pnt_y = cudf.Series(y_s)
cnt = cudf.Series(cnt)

Then we use cuSpatial for Hausdorff distance computation.

dist = cuspatial.directed_hausdorff_distance(pnt_x,pnt_y,cnt)

Finally we convert the result back to a Numpy array.

urt =
dmatrix = np.reshape(urt,(len(cnt),len(cnt)))

On an NVIDIA Quadro GV100,


only takes 13.38 seconds to finish computing the Hausdorff distance for all trajectory pairs. Including the data transforming and transferring time, cuSpatial takes 13.54 seconds in total. Instead of waiting for hours, now we can jump directly into the clustering exploration.

In this example we apply AgglomerativeClustering and DBSCAN from scikit-learn. The clustering results are visualized in an interactive way using ipython widgets, allowing the user to play with the hyperparameter set and see how the clustering results respond. Figure 4 shows a sample result for AgglomerativeClustering , and Figure 5 shows a sample result for DBSCAN.

Figure 5: Trajectory clustering results using DBSCAN using cuSpatial for directed Hausdorff distance computation.

Try Out cuSpatial Today

cuSpatial aims to provide high performance and ease of use for large-scale geospatial and spatiotemporal processing. Existing commercial GIS systems can leverage cuSpatial to accelerate their performance and significantly improve end user experience. Existing commercial databases can leverage cuSpatial to extend coverage for spatial data management.

cuSpatial is pre-release, open source software. The first official release will be part of the RAPIDS 0.10 release in October, 2019. You can try it out today by installing the cuSpatial conda packages using this command:

conda install -c rapidsai-nightly cuspatial

Or you can build from source from the cuSpatial Github repo. If you like cuSpatial, please star the repo on GitHub! Here are some example Python notebooks to help get you started.

About the Authors

Dr. Jianting Zhang is an Associate Professor in Geographical Information Systems and Computer Science at the City College of New York. He is also a visiting faculty member at NVIDIA

Dr. Milind Naphade is the Chief Technology Officer for Intelligent Video Analytics and AI Cities at NVIDIA

Mark Harris is a Principal System Software Engineer in the RAPIDS team at NVIDIA

Sujit Biswas is a Principal Engineer in the Intelligent Video Analytics team at NVIDIA

Thomson Comer is a Senior System Software Engineer in the RAPIDS team at NVIDIA

Dr. Shuo Wang is a Data Scientist in the Intelligent Video Analytics team at NVIDIA

Keith Kraus is a System Software Engineering Manager in the RAPIDS team at NVIDIA

Joshua Patterson is the General Manager of Data Science at NVIDIA


RAPIDS is a suite of software libraries for executing end-to-end data science & analytics pipelines entirely on GPUs.


Written by


Dr. Milind Naphade is the chief technology officer of AI Cities at NVIDIA, where he leads technology strategy for the company’s Metropolis platform.



RAPIDS is a suite of software libraries for executing end-to-end data science & analytics pipelines entirely on GPUs.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade