A Snowflake Collaborative Logistics Solution for Retailers

Date: July 2023
Opinions expressed in this post are solely my own and do not represent the views or opinions of my employer.

Photo by Chris Dickens on Unsplash

When looking at supply chain logistics for retailers, we are seeing a few key themes:

  1. After the pandemic, online shopping has increased drastically. This increase in demand requires retail organizations to optimize their logistics around customer deliveries.
  2. The most expensive and time-consuming part of the shipping process, known as the “last mile delivery problem”, is getting only costlier and more inefficient in especially fast-growing cities. Last mile problem comprises around 53% of overall cost of shipping. (Source: https://www.insiderintelligence.com/insights/last-mile-delivery-shipping-explained/) Furthermore, last mile delivery also plays a critical role in the customer’s overall satisfaction. High traffic congestion, inefficient routing, and complex delivery schedules can lead to delays and customer dissatisfaction. Please read more here: https://carto.com/blog/last-mile-transportation-route-optimization.
  3. Increasing online deliveries and suboptimal logistics negatively affect urban living contributing largely to traffic congestion, air pollution and other transportation issues in cities. Many cities are exploring smart city initiatives to find creative ways to improve the negative effects of traffic congestion.

In addition to several newer initiatives to address these issues, such as location-based crowdsourcing and smart mobility, a potential Snowflake collaborative logistics solution enables retailers to consolidate deliveries (without sharing PII) and leverage a cutting-edge vehicle route optimization service from Carto, a cloud-native location intelligence platform, natively running in Snowflake.

Collaborative Logistics Data Clean Room

Let’s take a closer look at the key components of the solution:

Snowflake Data Clean Rooms with Geospatial Functions:

Using a Snowflake data clean room, daily customer deliveries with geocoordinates that are stored in each retailer’s Snowflake account can be combined via quries without sharing any PII or customer addresses by using geospatial functions like ST_DWITHIN (within 5 miles or 8047 meters, in our example) in the data clean room query as shown below:

with base as (
select
p1.party_id as p1_party_id
,try_to_geography(p1.geom) as p1_target_geom
,p1.order_number as p1_order_number
,p2.party_id as p2_party_id
,try_to_geography(p2.geom) as p2_target_geom
,p2.order_number as p2_order_number
,try_to_geography(ndl.geom) as ndl_geom
from p1_orders as p1 ,p2_orders as p2 ,non_delivery_locations as ndl
where ST_DWITHIN(p1_target_geom ,p2_target_geom, 5 * 1609 ) = true
and ST_CONTAINS(ndl_geom ,p1_target_geom) = false
and ST_CONTAINS(ndl_geom ,p2_target_geom) = false
)
(select p1_party_id as party_id ,p1_target_geom as delivery_location ,p1_order_number as order_number
from base
)
union
(select p2_party_id as party_id , p2_target_geom as delivery_location ,p2_order_number
from base
)

After this, we used two advanced geospatial functions that are available in the Carto Analytics Toolbox (A new Native App in the Snowflake Marketplace!). Combined deliveries in the data clean room consumer account are clustered as dynamic groups (we specified 5 clusters) using ST_CLUSTERKMEANS function and then converted into polygons using ST_CONCAVEHULL function. A similar query can be constructed as below:

with clustered_points as (
select
array_agg( st_asgeojson(delivery_location)::string ) as agg
,carto.carto.st_clusterkmeans(agg, 5) as cluster_arr
from p1_p2_orders
)
select
f.value:cluster::int as cluster_id
,array_agg(f.value:geom::variant) as geomarray
,carto.carto.st_concavehull(geomarray) as cluster_polygon
from clustered_points as b
,lateral flatten(input => b.cluster_arr) as f
group by cluster_id

The polygons are used in visualizing the 5 clusters. In our example, we are using City of Austin, which is one of the fast-growth cities in the United States as shown in Carto’s map visualization below:

Carto Builder Map Visualization of clusters

Retrieval of Near Real-time Traffic Accident Data Using External Access (in Private Preview as of July 2023):

To make a call to a public endpoint (Austin Real-time Traffic Incident Reports Web Service, in our case), we use the brand-new, easy External Access feature in Snowflake by creating a NETWORK RULE and an EXTERNAL ACCESS INTEGRATION construct that is used in a User-defined function as shown below:

create or replace network rule austin_api_network_rule
mode = egress
type = host_port
value_list = ('data.austintexas.gov')
comment = 'https://data.austintexas.gov/resource/dx9v-zd7x.json?$order=traffic_report_status_date_time%20DESC';

create or replace external access integration austin_traffic_api
allowed_network_rules = (austin_api_network_rule)
enabled = true;
create or replace function MOBILITY.PUBLIC.fetch_traffic()
returns variant
language python
runtime_version = 3.8
handler = 'main'
external_access_integrations = (austin_traffic_api)
packages = ('requests')
AS
$$
import _snowflake
import requests
import json
def main():
url = 'https://data.austintexas.gov/resource/dx9v-zd7x.json?$order=traffic_report_status_date_time%20DESC'
response = requests.get(url)

# Note: try/catch best practices here:)
js = response.json()
return js

With this UDF, we can notify drivers about recent traffic incidents in a timely manner.

Also see the related post on the new Direct Access feature here: https://medium.com/snowflake/loading-data-from-github-or-public-website-to-snowflake-331009d378c2

Carto’s Vehicle Route Planning Service Running in Snowpark Container Services (in Private Preview as of July 2023):

Leaving the best for the last, we used the brand new Snowpark Container Services in this solution to run Carto’s Vehicle Route Planning (VRP) service natively in Snowflake to recommend routes for the deliveries in each cluster. Carto’s VRP combines OSRM, OR tools and proprietary IP to calculate routes with many stops and vehicles at scale.

If you came this far:), I highly recommend reading this excellent Tech Primer on the brand new Snowpark Container Services: https://medium.com/snowflake/snowpark-container-services-a-tech-primer-99ff2ca8e741

After uploading Carto’s VRP Docker image to our repository in Snowflake, VRP service executes as a job specifying a CPU based compute pool and the job specification yaml file as shown below as below:

execute service 
compute_pool = fleet_optimize_compute_pool
spec = @fleet_optimiz.data_schema.yaml_stg/vrp_sp/spec.yaml;

Compared to other container services option, running Carto’s VRP image in Snowpark Container Services was a lot more easier experience!

As shown below, another map visualization from Carto delineates optimized routes for one of the delivery clusters in the City of Austin. As shown in the URL, the Carto Web UI can also run in Snowpark Container Services as a long running service. (And that is another blog post:)

Carto’s map visualization running in Snowpark Container Services PrPr

Special thanks to Venkat Sekar for his valuable contributions developing this solution.

--

--

Eda Johnson
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

NVIDIA | AWS Machine Learning Specialty | Azure | Databricks | GCP | Snowflake Advanced Architect | Terraform certified Principal Product Architect