Cobb County graph: nodes -> road center points, edges -> road intersections. Image: author

A Spatio-temporal Graph Neural Network (STGNN) for Traffic Incident Prediction Using ArcPy and PyTorch Geometric

Chunguang (Wayne) Zhang
7 min read · Dec 27, 2023

A unique graph deep learning approach for traffic incident prediction.

Introduction: Graph Neural Networks (GNNs) have gained attention for representing complex data structures, such as molecular networks, protein networks, and chemical structures, as nodes and edges. Each node is embedded in a low-dimensional vector space that summarizes the node’s position and the structure of its local neighborhood [1]. In my last blog of 2021, I described an experimental study of a ConvLSTM model for next-day car crash prediction that treated the Cobb County map as a grid image.

In this blog, I delve further into spatio-temporal graph neural networks (STGNNs) for traffic crash forecasting. The term spatio-temporal graph usually refers to a homogeneous graph with a fixed topology whose node features change over time at discrete time steps corresponding to sampled observations. In September 2020, DeepMind collaborated with Google Maps to unveil an advanced GNN model for improved ETA accuracy. Traffic networks inherently exhibit graph-like structures, which makes this approach particularly intuitive. The image below is sourced from the Google DeepMind blog post [2].

Image by https://deepmind.google/discover/blog/traffic-prediction-with-advanced-graph-neural-networks/

Problem: Predicting traffic incidents such as crashes in a spatio-temporal context is crucial for public safety and government authorities. However, the task is exceptionally challenging: it involves a multitude of contributing factors, requires merging heterogeneous data sources such as human behavior, time, road geometry, and weather, and is further complicated by sparse datasets. Traditional accident prediction methods, such as Poisson and Negative Binomial (NB) regression, often prove inadequate when faced with the intricacies of spatio-temporal correlations.

Solution: The proposed solution involves two steps. 1. Build a graph structure using ArcGIS Pro geoprocessing tools and PyTorch Geometric. 2. Implement an A3T-GCN (Attention Temporal Graph Convolutional Network) model for short-term crash prediction along county road segments within specified time frames, addressing the questions of when, where, and how many accidents occur. This graph-based architecture can also be extended to other traffic studies, including the estimation of traffic speed, volume, and travel time.

Google ETAs: It’s natural to view road segments as graph edges and intersections as graph nodes. However, intersections lack the detailed feature information found in road lines, which is essential for an accurate road network representation. Google’s travel time graph architecture provides an effective approach to modeling the road network and its dynamics. By treating road segments as graph nodes, we can create graph edges that connect these segments, as illustrated in Fig. 2. This allows informative messages to be aggregated between road segments, facilitating information flow throughout the road network. In addition, leveraging the inductive biases of convolutional and recurrent neural networks helps capture the spatio-temporal relationships among road segment links and predict traffic incidents on adjacent and intersecting roads.

Graph Informative Messages: The graph can be denoted as G(V, E), where V is the set of nodes and E is the set of edges. The adjacency matrix A represents the connections between nodes: A_{vu} = 1 signifies an edge between nodes v and u; otherwise A_{vu} = 0. A graph neural network (GNN) layer can be written as H^{(l+1)} = f(H^{(l)}, A), where H^{(l)} holds the node features at layer l and f denotes the aggregation function. Graph Convolutional Networks (GCNs) leverage pairwise message passing, allowing graph nodes to iteratively update their representations from their neighbors. In the short-term crash prediction model, factors such as traffic conditions, weather, road geometry, and driver characteristics are engineered as node features.
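To make the layer update concrete, here is a minimal sketch of one message-passing round. It assumes the simplest possible aggregation function f: each node averages its own features with those of its neighbors, with no learned weights. The toy three-segment graph, the feature values, and the function name propagate are illustrative assumptions, not the model used in this post.

```python
def propagate(adjacency, features):
    """One round of mean-aggregation message passing (no learned weights)."""
    n = len(adjacency)
    updated = []
    for v in range(n):
        # Neighbors of v, plus v itself (self-loop) so its own state is kept.
        neighbors = [u for u in range(n) if adjacency[v][u] == 1 or u == v]
        dim = len(features[v])
        updated.append([
            sum(features[u][k] for u in neighbors) / len(neighbors)
            for k in range(dim)
        ])
    return updated


# Toy path graph 0 - 1 - 2 (three road segments in a row).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H0 = [[1.0], [0.0], [0.0]]  # a signal present only on segment 0

H1 = propagate(A, H0)  # after one layer, segment 2 has not heard from segment 0
H2 = propagate(A, H1)  # after two layers, the signal has reached segment 2
```

Stacking layers widens the neighborhood each node can see, which is why information from one road segment can influence predictions a few segments away.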

Data Sources: The analysis uses Cobb County road crash data from 2018 to 2021. Bidirectional road lines are consolidated into single-direction lines for a more efficient graph representation (Fig. 1).

Fig. 1 single direction Cobb road networks. image: author

Graph Building Process: Several Esri ArcGIS Pro geoprocessing tools and ArcPy are used to construct the nodes and edges of the Cobb County road network. A total of 246 nodes and 564 edges are constructed.

1. Dissolve roads by street name:

  • Use ArcGIS Pro DISSOLVE on the Road Arterial layer with the attribute Street Name and option UNSPLIT_LINES.
  • Result layer: ROAD_ARTERIAL_DISSOLVE.
image source, https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/dissolve.htm

2. Generate lines at intersections:

  • Use FEATURE TO LINE to create lines generated by splitting at intersections.
  • Result layer: ROAD_ARTERIAL_FEATURETOLINE.
image source, https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/feature-to-line.htm

3. Create road intersection points:

  • Use INTERSECT tool on ROAD_ARTERIAL_FEATURETOLINE to generate points at intersections.
  • Result layer: ROAD_ARTERIAL_INTERSECTIONS.
Intersection points. image: author

4. Create road center line points as graph nodes:

  • Use GENERATE POINTS ALONG LINES on ROAD_ARTERIAL_FEATURETOLINE with a percentage of 50% to create ROAD_ARTERAL_MIDPT.
  • This feature class serves as the graph nodes; road attributes are transferred through the shared road IDs.
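Conceptually, placing a point at 50% of a line means walking half of its total length along its vertices. The sketch below illustrates that idea in plain Python with planar coordinates. It is not the ArcGIS implementation, and the function name point_at_fraction and the sample coordinates are made up for illustration.

```python
import math


def point_at_fraction(polyline, fraction=0.5):
    """Return the (x, y) point located at `fraction` of the polyline's length."""
    segments = list(zip(polyline[:-1], polyline[1:]))
    lengths = [math.dist(p, q) for p, q in segments]
    target = fraction * sum(lengths)
    travelled = 0.0
    for (p, q), seg_len in zip(segments, lengths):
        # Stop on the first piece that contains the target distance.
        if seg_len > 0 and travelled + seg_len >= target:
            t = (target - travelled) / seg_len
            return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        travelled += seg_len
    return polyline[-1]  # fallback for degenerate lines or rounding


road = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0)]  # L-shaped segment, length 6
midpoint = point_at_fraction(road)           # 3 units along the first leg
```

Note that the midpoint follows the line's shape, so for a curved road it generally differs from the average of the two end points.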

5. Spatially join intersections and roads:

  • Use SPATIAL JOIN with a one-to-many option to join ROAD_ARTERIAL_INTERSECTIONS and ROAD_ARTERIAL_FEATURETOLINE.
  • Result layer: NODES_SPATIAL_ARTERIAL.

This process establishes the connections between intersection points and road polylines.

6. Create edge_index for Graph:

  • Create an edgeIndex, a 2D tensor defining source and target nodes of all edges, represented as e = uv in a graph.
  • This edgeIndex is crucial for graph representation in PyTorch Geometric.

The result is a 2D tensor:

tensor([[ 1, 1, 2, …, 241, 241, 242], [ 2, 3, 3, …, 242, 243, 243]], dtype=torch.int32)
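One way to derive such an index from a one-to-many spatial join is sketched below. It assumes the join table can be read as (intersection ID, road ID) pairs, and it links every two road segments that touch the same intersection. The function name build_edge_index and the sample rows are hypothetical, and the actual table schema may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_edge_index(join_rows):
    """join_rows: iterable of (intersection_id, road_id) pairs.

    Returns [sources, targets] with one edge per pair of road segments
    that touch the same intersection.
    """
    roads_at = defaultdict(set)
    for intersection_id, road_id in join_rows:
        roads_at[intersection_id].add(road_id)

    edges = set()  # a set avoids duplicates when two roads meet twice
    for road_ids in roads_at.values():
        for u, v in combinations(sorted(road_ids), 2):
            edges.add((u, v))

    ordered = sorted(edges)
    return [[u for u, _ in ordered], [v for _, v in ordered]]


# Hypothetical join output: intersection 10 touches roads 1, 2 and 3;
# intersection 11 touches roads 3 and 4.
rows = [(10, 1), (10, 2), (10, 3), (11, 3), (11, 4)]
edge_index = build_edge_index(rows)
```

The nested list can then be wrapped with torch.tensor(edge_index, dtype=torch.long). PyTorch Geometric conventionally expects a [2, num_edges] tensor of type torch.long with node indices starting at 0, and an undirected graph lists each edge in both directions, so a re-indexing step and torch_geometric.utils.to_undirected may be needed.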

7. Create edgeLink feature lines for Graph:

  • This line feature class uses the edge index to link the ROAD_ARTERAL_MIDPT points with line features.
Fig 2. Edge link feature lines (green) connected road center line points. image: author
Fig. 3. Cobb County Road Network Graph with Nodes and Edges, image: author

Apply Graph GCNN Model: With the county road network treated as a fixed graph built by the process above, each road segment corresponds to a node, and edges connect consecutive or intersecting segments (Fig. 3). In addition, each node carries six features describing road characteristics. We can now experiment with the A3T-GCN (Attention Temporal Graph Convolutional Network) model to train on and predict next-day or next-week crashes along road segments. A3T-GCN extends the T-GCN model by adding an attention mechanism that re-weights the influence of historical crash states and thus captures the global variation trends of crashes on the road. T-GCN is a temporal GCN model that combines GCN and GRU. The details of the model can be found in the paper [5].
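The re-weighting idea can be sketched in a few lines. Given one hidden state per time step, attention scores are turned into weights with a softmax and the states are summed with those weights. In the real model the scores are learned and the states come from T-GCN cells; here both are hand-set toy values, and the function names are illustrative assumptions.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, scores):
    """Weighted sum over time of per-step node states.

    hidden_states: list over time steps; each item is a list of node values.
    scores: one raw attention score per time step.
    """
    weights = softmax(scores)
    num_nodes = len(hidden_states[0])
    return [
        sum(w * step[node] for w, step in zip(weights, hidden_states))
        for node in range(num_nodes)
    ]


# Three time steps, two road segments; values stand in for T-GCN outputs.
states = [[0.0, 1.0], [0.0, 1.0], [3.0, 1.0]]
uniform = attention_pool(states, [0.0, 0.0, 0.0])  # equal scores: plain average
recent = attention_pool(states, [0.0, 0.0, 10.0])  # last step dominates
```

With equal scores the result is a plain average over time, while a high score on the last step lets the most recent state dominate the pooled representation.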

image source: arXiv:2006.11583

I used seven-day crash sequences as model inputs. The data, with dimensions (1096, 245, 6, 7), are batched into the model with 80% of the sequences used for training and 20% for testing. Each sequence is shifted by one day, for a total of 1096 sequences. The implementation is credited to the PyTorch Geometric Temporal [4] source code on GitHub. The results and code are on my GitHub as well.
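The windowing can be sketched as follows, using a single road segment and plain lists for brevity. Two points are assumptions on my part rather than details stated above: the target of each window is the day that immediately follows it, and the 80/20 split is chronological. The function names and toy data are illustrative.

```python
def make_windows(daily, window=7):
    """Pair each `window`-day history with the following day as the target."""
    samples = []
    for start in range(len(daily) - window):
        history = daily[start:start + window]  # days t .. t+6
        target = daily[start + window]         # day t+7 (assumed target)
        samples.append((history, target))
    return samples


def chronological_split(samples, train_ratio=0.8):
    """Keep time order: the earliest sequences train, the latest ones test."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Toy stand-in: 20 days of crash counts for a single road segment.
daily_counts = list(range(20))
samples = make_windows(daily_counts)
train, test = chronological_split(samples)
```

Splitting by time rather than at random keeps future days out of the training set, which gives a more honest estimate of forecasting error.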

Dataset loader by author
GNN model. image: author
One epoch training with loss RMSE:0.3124. image: author
A typical one day prediction map. image: author

Why it matters: This empirical study shows how a graph deep learning model such as GCNN can offer a promising solution for crash prediction. The workflows and data processes can be further enhanced with more engineered features and data, then automated and deployed as services. The same methodology can be applied to other spatio-temporal problems, such as traffic flow, rainfall, and crime prediction, and more, for the benefit of humanity.

The world today presents us with many challenges. However, with the power of AI and deep learning, we can work together to seek solutions and make the world a much better place to live. Thank you for reading this blog.

References:

[1] Leskovec, Representation learning on graphs: methods and applications, IEEE Data Engineering Bulletin 40 (3) (2017) 52–74.

[2] ETA Prediction with Graph Neural Networks in Google Maps, https://arxiv.org/abs/2108.11482

[3] Li, Yaguang and Yu, Rose and Shahabi, Cyrus and Liu, Yan, Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting, International Conference on Learning Representations (ICLR ‘18)

[4] Benedek Rozemberczki and Paul Scherer and Yixuan He and George Panagopoulos and Alexander Riedel and Maria Astefanoaei and Oliver Kiss and Ferenc Beres and Guzman Lopez and Nicolas Collignon and Rik Sarkar, PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models, Proceedings of the 30th ACM International Conference on Information and Knowledge Management, 4564–4573, 2021

[5] Jiawei Zhu, Yujiao Song, Ling Zhao, Haifeng Li, A3T-GCN: Attention Temporal Graph Convolutional Network for Traffic Forecasting, arXiv:2006.11583

[6] Graph neural network — Wikipedia


Chunguang (Wayne) Zhang

Information Business Analyst in Cobb County. Microsoft Certified Data Scientist