Industrial Renaissance Tracker (IRT) Update: Insights into Investment and Logistic Clusters
Robust domestic supply chains have become a paramount priority amid rising economic nationalism and trade tensions. While the United States benefits from a strong global trading network, the pandemic revealed vulnerabilities tied to overreliance on international supply chains. In this latest edition of AlphaGeo’s Industrial Renaissance Tracker (IRT) analysis, we explore the relationship between transportation networks and investment inflows to test the hypothesis that robust transportation infrastructure is a cornerstone of investment flowing to IRT locations across the nation.
We continue to track White House data that monitors private investment inflows across the US and have correlated these investment locations with four major transportation systems: airports, roads, major US seaports, and truck stops. Preliminary analysis reveals that the US has a highly dense network of airports and roads, so these two factors are unlikely to significantly influence companies’ investment decisions; the rest of this analysis therefore focuses on seaports and truck stops. The seaport data maps major US seaports, which are defined in Engineering Pamphlet 1130–2–520 as: “(1) Port limits defined by legislative enactments of state, county, or city governments. (2) The corporate limits of a municipality.” The truck stop parking data maps all truck stops across the US and was compiled as a result of Jason’s Law, which the United States Congress established to address the shortage of long-term parking for commercial motor vehicles on the National Highway System, aiming to improve safety for both motorized and non-motorized users. Both seaports and truck stops are essential for facilitating the movement of goods and services for B2B and B2C operations.
Figure 1 gives a snapshot of seaport and truck stop locations in the US. Most are concentrated along the East Coast and in Washington State and California.
To begin the analysis, AlphaGeo uses a spatial clustering technique, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), with the IRT, seaport, and truck stop locations. DBSCAN groups truck stops and seaports based on their geographic coordinates (latitude and longitude). The clustering was parameterized to identify regions within a specified proximity (~5 km) and with a minimum point density. Next, the IRT locations were overlaid to assess spatial relationships and to identify hotspots of logistics hubs.
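Because the inputs are latitude and longitude, a ~5 km proximity threshold has to be measured along the Earth's surface rather than in raw degrees. The following is a minimal sketch of that step, assuming a great-circle (haversine) distance; the coordinates, the `haversine_km` and `neighbors_within` helpers, and the choice of haversine itself are illustrative assumptions and not details taken from the analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def neighbors_within(points, index, eps_km=5.0):
    """Indices of all other points lying within eps_km of points[index]."""
    return [j for j, p in enumerate(points)
            if j != index and haversine_km(points[index], p) <= eps_km]


# Hypothetical (lat, lon) pairs, made up for illustration only.
sites = [
    (33.7400, -118.2600),
    (33.7550, -118.2400),
    (33.7700, -118.2000),
    (34.0500, -117.6000),
]
near_first = neighbors_within(sites, 0)
print(near_first)
```

If a library implementation that expects haversine inputs in radians is used instead (scikit-learn's DBSCAN is one such option, though the article does not name its tooling), the same threshold would be expressed as 5 / 6371, roughly 0.000785 radians.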
DBSCAN uses two key parameters to classify points into clusters: a neighborhood distance (eps) and a minimum density threshold. A point joins a cluster if it lies within eps of a point in that cluster whose neighborhood meets the minimum density threshold. Points that fail these criteria are labeled as noise. Noise points often represent low-density areas, such as sparsely populated regions, or isolated locations like truck stops or ports far from dense clusters. They can also include edge cases: points near the boundaries of clusters that do not satisfy the density requirements. This classification is critical to our analysis, given the limitations of the data available.
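The classification rule can be sketched in a few lines. This is a minimal, unoptimized illustration rather than the implementation used for the analysis: it assumes points already projected to planar kilometre coordinates, counts a point as part of its own neighborhood, and uses hypothetical values for `eps`, `min_samples`, and the points themselves.

```python
import math
from collections import deque

NOISE = -1


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1).

    A point is a core point when at least min_samples points, itself
    included, lie within eps of it (a convention assumed for this sketch).
    """
    n = len(points)
    # Each neighborhood includes the point itself (distance 0).
    neighborhoods = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighborhoods]
    labels = [None] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] is not None or not is_core[i]:
            continue
        # Grow a new cluster outward from an unlabeled core point.
        labels[i] = cluster_id
        queue = deque([i])
        while queue:
            current = queue.popleft()
            for j in neighborhoods[current]:
                if labels[j] is None:
                    labels[j] = cluster_id
                    if is_core[j]:  # only core points keep expanding the cluster
                        queue.append(j)
        cluster_id += 1
    # Anything never reached from a core point is noise.
    return [NOISE if label is None else label for label in labels]


# Hypothetical planar coordinates in km, made up for illustration only.
points = [
    (0, 0), (2, 0), (0, 2),              # dense group: all core points
    (6, 0),                              # border point, reachable from (2, 0)
    (100, 100), (103, 100), (100, 103),  # second dense group
    (50, 50), (300, 300),                # isolated points
]
labels = dbscan(points, eps=5, min_samples=3)
print(labels)
```

The point at (6, 0) shows the boundary behavior described above: it has too few neighbors to be a core point, but it still joins the first cluster because it sits within `eps` of one. The two isolated points are never reached from any core point and are labeled as noise.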
Figure 2 highlights spatial clusters of truck stops and seaports across the US (excluding noise). As expected, the clusters are concentrated along major highways and coastal areas, reflecting logistics hubs and high-traffic regions. Isolated points mark areas with little or no clustering, such as standalone facilities.
Figure 3 overlays IRT locations (in red) onto these clusters (excluding noise), showing significant overlap in key logistics hubs. Many IRT sites align with high-density clusters, while some are scattered in regions with less logistics activity, which may indicate site selection criteria specific to those investments.
To further examine the spatial relationships, we run a correlation analysis on the counts of IRT locations, seaports, and truck stops in the clusters created through DBSCAN. Figure 4 highlights strong positive relationships among these counts. We observe a near-perfect positive correlation between the count of truck stops and the count of seaports (0.95), which is as expected. We also see a correlation of 0.94 between the count of seaports and the count of IRT locations and a correlation of 0.95 between the count of truck stops and the count of IRT locations. These results support our hypothesis that transportation networks are linked to the flow of investment: clusters with a higher density of truck stops also tend to contain more ports and IRT locations, implying a spatial interdependence between these infrastructures. This high correlation may reflect shared geographical or functional patterns, such as key hubs or transportation networks, or state policies.
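As an illustration of how a correlation matrix over per-cluster counts can be computed, the sketch below uses the Pearson coefficient. The article does not state which coefficient was used, so Pearson is an assumption here, and the counts are made-up numbers that do not reproduce the reported 0.94 and 0.95 values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts per cluster (one list entry per cluster).
cluster_counts = {
    "truck_stops": [12, 30, 7, 45, 18],
    "seaports": [1, 3, 0, 5, 2],
    "irt_sites": [4, 9, 2, 16, 5],
}

# Pairwise correlation matrix keyed by (row, column) name.
names = list(cluster_counts)
corr = {
    (a, b): pearson(cluster_counts[a], cluster_counts[b])
    for a in names
    for b in names
}

for a in names:
    print(a, [round(corr[(a, b)], 2) for b in names])
```

Counts that all scale with the overall size of a cluster will tend to correlate strongly with one another, which is one reason to read such a matrix alongside the maps rather than on its own.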
The final layer of our analysis incorporates climate risk score data to assess its impact on logistics cluster locations. We employ a logistic regression model using cluster-level data, alongside AlphaGeo’s climate risk scores covering fire, heat, drought, and flooding risks.
We create a binary dependent variable (DV) that takes the value 1 for locations within a cluster and 0 otherwise. The independent variables are the average climate risk scores for each cluster (each cluster comprises counts of IRT locations, seaports, and truck stops). The dataset is significantly imbalanced, with far more locations inside a cluster than outside, so we oversampled the minority class (DV = 0). This approach duplicates samples from the minority class to ensure the model has sufficient data to learn from both classes. After balancing, the data was split into training (80%) and testing (20%) sets. The logistic regression model was then trained to predict the probability of a location being part of a cluster based on climate risk factors. Performance was evaluated using precision, recall, F1-score, and accuracy.
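The pipeline described above (random oversampling, an 80/20 split, then logistic regression) can be sketched end to end in plain Python. Everything specific here is an assumption made for illustration: the four feature columns stand in for fire, heat, drought, and flood scores, the rows are invented, and the helper names, learning rate, and epoch count are hypothetical rather than taken from the analysis.

```python
import math
import random


def oversample_minority(rows, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes match in size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row in rows:
        by_class[row[-1]].append(row)  # the last column holds the DV
    minority, majority = sorted(by_class.values(), key=len)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def predict_proba(weights, bias, features):
    """Probability that DV = 1 under the logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0]) - 1
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row in rows:
            features, label = row[:-1], row[-1]
            error = predict_proba(weights, bias, features) - label
            grad_b += error
            for k, x in enumerate(features):
                grad_w[k] += error * x
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias


# Hypothetical rows: (fire, heat, drought, flood, DV), scores scaled to [0, 1].
rows = [
    (0.20, 0.55, 0.30, 0.40, 1), (0.25, 0.60, 0.35, 0.45, 1),
    (0.30, 0.50, 0.25, 0.50, 1), (0.15, 0.65, 0.40, 0.35, 1),
    (0.35, 0.45, 0.30, 0.55, 1), (0.22, 0.58, 0.28, 0.42, 1),
    (0.28, 0.52, 0.33, 0.48, 1), (0.18, 0.62, 0.38, 0.38, 1),
    (0.70, 0.80, 0.75, 0.20, 0), (0.65, 0.85, 0.70, 0.15, 0),
]

balanced = oversample_minority(rows)
train, test = train_test_split(balanced)
weights, bias = fit_logistic(train)
probabilities = [predict_proba(weights, bias, row[:-1]) for row in test]
print(len(balanced), len(train), len(test))
```

The sketch follows the order given in the text, balancing before splitting. With that order, copies of the same minority row can land on both sides of the split, so a common variant is to split first and oversample only the training portion.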
The model achieved an overall accuracy of 60%. It identified locations within clusters (DV = 1) but struggled with non-clustered locations (DV = 0), for which we see no significant results despite oversampling. For DV = 1, the model reached a precision of 60% and a recall of 100%: it captures every clustered location but has limited ability to distinguish non-clustered ones. These results highlight the need for further refinements, such as incorporating additional features or using alternative modeling techniques, to improve the model’s ability to predict non-clustered locations.
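To make the relationship between these metrics concrete, the sketch below computes precision, recall, F1-score, and accuracy from true and predicted labels. The labels are hypothetical and were chosen so that a classifier predicting every location as clustered shows the same headline pattern (60% precision, 100% recall, 60% accuracy); this illustrates how the metrics interact and is not a reconstruction of the model's actual test set.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1, and accuracy, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical test set: 60 clustered and 40 non-clustered locations,
# with a classifier that predicts "clustered" for every one of them.
y_true = [1] * 60 + [0] * 40
y_pred = [1] * 100

clustered = classification_metrics(y_true, y_pred, positive=1)
non_clustered = classification_metrics(y_true, y_pred, positive=0)
print(clustered)
print(non_clustered)
```

In this made-up case, recall for DV = 1 is perfect only because nothing is ever predicted as DV = 0, while precision and accuracy both equal the share of clustered locations in the test set. Reporting the metrics for each class separately, as the text does, is what exposes this kind of behavior.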
In conclusion, this analysis emphasizes the critical role of transportation networks, particularly truck stops and seaports, in driving investment inflows and shaping logistics clusters. The strong spatial and statistical connections between logistics infrastructure and IRT sites reveal the strategic advantage of aligning investments with dense transportation hubs. Lastly, the analysis of climate risks points to the need for more resilient infrastructure in high-risk areas to protect investments. As methods and data continue to advance, these analyses are expected to become more rigorous and refined, offering deeper insights into the dynamics of America’s industrial renaissance.