Streamlit in Snowflake: How to Visualise Geospatial Shapes

Saša Mitrović
7 min read · Feb 7, 2024

With Streamlit in Snowflake, organisations build, run and share interactive, data-heavy applications directly in their own Snowflake accounts. It is now so easy to get to know your data that there are no more excuses for not doing it. In this post, I’m showing how to do advanced visualisation of geospatial data in Streamlit.

· Why do Organisations Need to Visualise Geospatial Data?
· Visualising GeoJSON Layers using Pydeck in Streamlit
· Visualize Polygon Data from a Snowflake table using Pydeck in Streamlit
· Summary

Why do Organisations Need to Visualise Geospatial Data?

Admittedly, I’m a fan of printed maps, digital maps, location intelligence and all things geospatial, having worked for one of the leading companies in the space — Here Technologies (formerly Navteq).

But, why do companies need geospatial data analysis? Let’s look at some examples.

  1. Retail businesses rely on location intelligence to choose optimal locations for new stores. Geospatial analysis helps in visualising demographic data, foot traffic patterns and competitor locations.
    By overlaying this information on maps, businesses identify prime locations for their target customer base, ensuring better market penetration and profitability.
  2. During emergencies such as natural disasters, timely and informed decision-making is critical for effective response and resource allocation. Geospatial data is essential for visualising the affected areas, identifying vulnerable populations, and coordinating emergency response efforts.
    By overlaying data on a map, emergency response organisations can assess the situation, plan evacuation routes, and allocate resources strategically, improving overall disaster management.
  3. Core tasks in logistics and supply chain management include optimising routes, minimising transportation costs and ensuring timely deliveries.
    Geospatial analysis enables businesses to visualise transportation networks, identify bottlenecks and optimise routes. Real-time tracking of shipments using geospatial data helps in monitoring the movement of goods, predicting delivery times and responding promptly to disruptions in the supply chain.

Visualising GeoJSON Layers using Pydeck in Streamlit

GeoJSON is a simple, versatile and popular format for geospatial data. Here’s an example GeoJSON data set:

https://raw.githubusercontent.com/visgl/deck.gl-data/master/examples/geojson/vancouver-blocks.json

This is a complex dataset which holds hundreds of polygons, each one describing the territory of one Vancouver city block.
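To get a feel for such a file before mapping it, a few lines of Python are enough. The sketch below counts the features of a FeatureCollection by geometry type. The tiny inline sample only stands in for the real file, and the property names `valuePerSqm` and `growth` are assumptions based on the public Vancouver example.

```python
import json
from collections import Counter

# Tiny stand-in for vancouver-blocks.json (the real file has many more features).
# The property names are assumptions based on the public example dataset.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"valuePerSqm": 4563, "growth": 0.35},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-123.02, 49.28], [-123.01, 49.28],
                         [-123.01, 49.29], [-123.02, 49.28]]]
      }
    }
  ]
}
"""


def geometry_counts(geojson):
    """Count the features in a FeatureCollection by geometry type."""
    return Counter(f["geometry"]["type"] for f in geojson["features"])


collection = json.loads(SAMPLE)
print(geometry_counts(collection))
```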

Snowflake handles the JSON data format natively. Here’s this GeoJSON’s visual representation in Streamlit in Snowflake using st.pydeck_chart:

Visualisation of GeoJSON data in Streamlit in Snowflake

Let’s create this visualisation from scratch in Streamlit in Snowflake (SiS). Only a few steps are required to do this:

  1. Create a Streamlit app in the Snowflake UI
  2. Make the GeoJSON file available to the app through a Snowflake stage
  3. Define the GeoJSON Pydeck layer passing in the GeoJSON data
  4. Visualise the map using this layer with st.pydeck_chart()

1. Creating a new Streamlit app in the Snowflake UI (Snowsight) is easy:

Creating a Streamlit app in Snowsight

With a few clicks, a new app is created containing sample code that showcases some of the cool Streamlit features. On the left side we can directly edit the app code.

2. Make the GeoJSON file available to the app through a Snowflake stage

To visualize the GeoJSON layer with Pydeck, we will use a GeoJSON file directly from a Snowflake stage without actually ingesting it into a table. How does that work?

Before we edit the newly created Streamlit app, let’s make the GeoJSON file available to the app by uploading it to a Snowflake stage. This can be done from Snowsight, here’s the documentation on that: https://docs.snowflake.com/en/user-guide/data-load-local-file-system-stage-ui#upload-files-onto-a-named-internal-stage

And here’s what that looks like:

Staging a GeoJSON File to Snowflake

3. Define the GeoJSON Pydeck layer passing in the GeoJSON data

Finally, let’s do some coding:

Here I’m fetching the file from the stage, opening it with the standard Python os package and parsing it with Python’s json package. All the packages I need for this app are available to the Streamlit app via a dedicated and secure Snowflake package repository, managed by Anaconda.

Finally, I’m creating the Pydeck GeoJson Layer.
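A minimal sketch of this step could look like the following. It is not the code from the linked repository: the stage path is hypothetical, the file is assumed to have been staged uncompressed, and the styling expressions assume the `valuePerSqm` and `growth` properties of the public Vancouver example. The pydeck import sits inside the function so the loading helper can be tried without pydeck installed.

```python
import json
import os

# Hypothetical stage location; replace with your own database, schema and stage.
STAGE_PATH = "@my_db.my_schema.my_stage/vancouver-blocks.json"


def load_geojson_from_stage(session, stage_path, local_dir="/tmp"):
    """Download a staged file to local_dir and parse it as JSON.

    Assumes the file was staged uncompressed, so the local copy keeps
    the same file name as the staged one.
    """
    session.file.get(stage_path, local_dir)
    local_path = os.path.join(local_dir, os.path.basename(stage_path))
    with open(local_path) as f:
        return json.load(f)


def make_geojson_layer(geojson):
    """Build a pydeck GeoJsonLayer that extrudes and colours each block."""
    import pydeck as pdk  # imported lazily; only needed when rendering

    return pdk.Layer(
        "GeoJsonLayer",
        data=geojson,
        opacity=0.8,
        stroked=False,
        filled=True,
        extruded=True,
        wireframe=True,
        # Assumed property names from the Vancouver example dataset.
        get_elevation="properties.valuePerSqm / 20",
        get_fill_color="[255, 255, properties.growth * 255]",
        get_line_color=[255, 255, 255],
    )


# Inside the Streamlit app this would be used roughly as:
#   from snowflake.snowpark.context import get_active_session
#   session = get_active_session()
#   layer = make_geojson_layer(load_geojson_from_stage(session, STAGE_PATH))
```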

4. Visualise the map using this layer with st.pydeck_chart()

Now that we have the GeoJSON layer, let’s create the initial view state i.e. center the map over Vancouver, create the Pydeck object and visualize it using Streamlit:
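As a sketch of this step, assuming the layer from the previous step is available: the coordinates below are approximate values for central Vancouver, and the zoom and pitch are illustrative choices rather than the settings used in the screenshot.

```python
def vancouver_view_state():
    """Camera settings as a plain dict (approximate, illustrative values)."""
    return {
        "latitude": 49.254,
        "longitude": -123.13,
        "zoom": 11,
        "pitch": 45,
        "bearing": 0,
    }


def render_map(layer):
    """Wrap the layer in a Deck object and hand it to Streamlit."""
    import pydeck as pdk  # imported lazily; only needed when rendering
    import streamlit as st

    deck = pdk.Deck(
        layers=[layer],
        initial_view_state=pdk.ViewState(**vancouver_view_state()),
    )
    st.pydeck_chart(deck)
```

The pitch tilts the camera so that extruded blocks appear in 3D; with a pitch of 0 the map is viewed straight from above.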

That’s all you need to visualize a GeoJSON using Streamlit in Snowflake.

You can find the complete source code here; it includes an extra rectangular polygon layer, just like in the screenshot below: https://github.com/sashamitrovich/Streamlit-Snowflake/blob/main/visualize-GeoJSON/SiS-GeoJSON.py

Visualize Polygon Data from a Snowflake table using Pydeck in Streamlit

In the previous example, I’ve shown how to visualize GeoJSON directly from a file on the stage. Usually, the geospatial data is already ingested and available in Snowflake. So let’s now use a Snowflake database as a source for the data.

I don’t have any data handy so I will just get a free dataset from the Snowflake Marketplace, like this one: https://app.snowflake.com/marketplace/listing/GZSTZO0V7VR/resilinc-eventwatch-ai

This product contains complex geospatial data about regions affected by supply chain disruptions.

When you click on the “Get” button, this dataset becomes available on your Snowflake account without the need to copy, ingest or transform any data. You can use it immediately.

This dataset contains a sample of different supply chain disruption events such as severe weather, regulatory changes, mergers & acquisitions… really anything that can have an impact on a business dependent on a reliable supply chain.

The dataset also includes factory and company sites affected by the disruption. The cherry on the top — all this information is supplemented by location data. The affected regions are represented as polygons and affected sites are also accompanied with their geo-coordinates.

And that is exactly what we’re going to visualize in Streamlit. Here are the steps:

  1. Prep data for Pydeck PolygonLayer to display affected region
  2. Create the Pydeck PolygonLayer
  3. Create a Heatmap to display affected sites as a heat map
  4. Create the Pydeck visualization from the layers using Streamlit

This is what the final result will look like:

Visualizing Polygon and Heatmap Data with Pydeck in Streamlit

Let’s build this step-by-step.

1. Prep data for Pydeck PolygonLayer to display affected region

Let’s get the data in the shape that Pydeck requires:

I’m using Snowpark to transform the data. To learn more about it, take a look at this tutorial: Getting Started With Snowflake for Python and Streamlit

The df_event dataframe holds all rows referencing this Event ID. I’ve chosen an event that comes with geospatial data. There are other events that don’t include geospatial data as it depends on the event type in this specific dataset.

To get the sites data, I create a df_event_sites dataframe — it includes basic site information and its geo-coordinates.

To get the coordinates of the affected region as a polygon, I’m taking the first row of df_event and creating a Pandas dataframe called df_region_pandas. Why Pandas? Pydeck doesn’t work with Snowpark dataframes yet, so we need to supply a Pandas dataframe as the data source.

Finally, I’ll print the contents of this dataframe just to display the final result. With this, we have the data ready to create the Pydeck layers.
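The data preparation described above can be sketched as follows. Every table and column name here (`EVENT_ID`, `REGION_GEOJSON`, `SITE_NAME`, `LATITUDE`, `LONGITUDE`) is hypothetical, since the actual schema of the Marketplace dataset is not shown, and the region is assumed to arrive as GeoJSON text describing a single Polygon.

```python
import json


def polygon_rings(geojson_text):
    """Parse GeoJSON Polygon text into its list of coordinate rings."""
    geometry = json.loads(geojson_text)
    if geometry["type"] != "Polygon":
        raise ValueError(f"expected a Polygon, got {geometry['type']}")
    return geometry["coordinates"]


def load_event_frames(session, events_table, sites_table, event_id):
    """Return (df_region_pandas, sites as pandas) for one event.

    Table and column names are placeholders for illustration only.
    """
    import pandas as pd  # imported lazily; only needed inside the app
    from snowflake.snowpark.functions import col

    df_event = session.table(events_table).filter(col("EVENT_ID") == event_id)
    df_event_sites = (
        session.table(sites_table)
        .filter(col("EVENT_ID") == event_id)
        .select("SITE_NAME", "LATITUDE", "LONGITUDE")
    )

    first_row = df_event.first()
    if first_row is None:
        raise LookupError(f"no rows found for event {event_id}")

    # Pydeck needs local data, so the region ends up in a Pandas dataframe.
    df_region_pandas = pd.DataFrame(
        [{"coordinates": polygon_rings(first_row["REGION_GEOJSON"])}]
    )
    return df_region_pandas, df_event_sites.to_pandas()
```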

2. Create the Pydeck PolygonLayer

Now that we have the data in the proper format, let’s iterate through df_region_pandas and process each row:

Creating an array of PolygonLayer objects for Pydeck

An event can cause disruptions in multiple regions, so we will assume the dataframe has multiple rows, each holding the data for one polygon.

Additionally, we’re calculating the center point so we position the map over the polygon accordingly.
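One way to sketch this loop, assuming each row has a `coordinates` column (a hypothetical name) holding GeoJSON-style rings of `[longitude, latitude]` pairs. The centre is taken as the middle of the outer ring’s bounding box, which is a simple approximation and not necessarily the calculation used in the original app.

```python
def ring_center(ring):
    """Return (longitude, latitude) of the centre of a ring's bounding box."""
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return ((min(lons) + max(lons)) / 2, (min(lats) + max(lats)) / 2)


def polygon_layers(df_region_pandas):
    """Build one PolygonLayer per row and return (layers, map centre)."""
    import pydeck as pdk  # imported lazily; only needed when rendering

    layers = []
    center = None
    for _, row in df_region_pandas.iterrows():
        rings = row["coordinates"]
        layers.append(
            pdk.Layer(
                "PolygonLayer",
                data=[{"polygon": rings}],
                get_polygon="polygon",
                filled=True,
                stroked=True,
                get_fill_color=[255, 0, 0, 60],
                get_line_color=[255, 0, 0],
                line_width_min_pixels=1,
            )
        )
        if center is None:
            # Position the map over the first region's outer ring.
            center = ring_center(rings[0])
    return layers, center
```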

3. Create a Heatmap to display affected sites as a heat map

Let’s now create the Heatmap layer with all the site coordinates, using the data we prepared before:

Note that I’m dropping rows with empty values so as not to confuse Pydeck, just in case the data is not perfect (it never is).
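A sketch of this step, under the assumption that the sites carry `LONGITUDE` and `LATITUDE` fields (hypothetical names). To keep the example free of extra dependencies, it filters plain records in Python; on a Pandas dataframe, `dropna(subset=[...])` does the same job.

```python
import math


def _is_missing(value):
    """True for None and for float NaN, the usual 'empty' markers."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_incomplete_sites(records, keys=("LONGITUDE", "LATITUDE")):
    """Keep only the records that have a usable value for every key."""
    return [
        record
        for record in records
        if not any(_is_missing(record.get(key)) for key in keys)
    ]


def make_sites_layer(records):
    """Build a HeatmapLayer from cleaned site records."""
    import pydeck as pdk  # imported lazily; only needed when rendering

    return pdk.Layer(
        "HeatmapLayer",
        data=drop_incomplete_sites(records),
        get_position=["LONGITUDE", "LATITUDE"],
        opacity=0.9,
    )


# With a Pandas dataframe of sites, the records could be obtained with
#   records = df_sites.to_dict("records")
```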

4. Create the Pydeck visualization from the layers using Streamlit

Finally, let’s put this together and visualize with st.pydeck_chart:

Note that I am passing an array of layers in the my_polygons variable plus a single layer called sites_layer when creating the Deck object.
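Putting the pieces together might look like the sketch below. The names `my_polygons` and `sites_layer` follow the text above; the zoom level and the `center` argument (a longitude/latitude pair) are illustrative assumptions.

```python
def combine_layers(my_polygons, sites_layer):
    """Flatten the polygon layers and the single sites layer into one list.

    Layers are drawn in list order, so the heat map ends up on top.
    """
    return [*my_polygons, sites_layer]


def render_event_map(my_polygons, sites_layer, center, zoom=4):
    """Create the Deck object and display it with Streamlit."""
    import pydeck as pdk  # imported lazily; only needed when rendering
    import streamlit as st

    longitude, latitude = center
    deck = pdk.Deck(
        layers=combine_layers(my_polygons, sites_layer),
        initial_view_state=pdk.ViewState(
            longitude=longitude, latitude=latitude, zoom=zoom
        ),
    )
    st.pydeck_chart(deck)
```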

That’s all there is to it!

Summary

In this post, I’ve explained the importance of being able to visualize geospatial data and shown examples of visualizing data coming in different formats: as a GeoJSON file and as geospatial data available in a Snowflake table. Let me know what you think and if you have any comments. Follow me for more content like this.

Disclaimer:

Views and opinions expressed in this article are my own and do not represent that of my place of work. I expressly disclaim any liability or loss incurred by any person who acts on the information, ideas or strategies discussed in my stories on Medium.com. While I make every effort to ensure that the information I’m sharing is accurate, I welcome any comments, suggestions, or correction of errors.
