Working with Geospatial Analytics in Snowflake

Riya Khandelwal
8 min readMar 16, 2023

--

In this article , we will discuss about various Geospatial data formats, Geospatial objects types and we will work with different Geospatial functions available on Snowflake which makes the Geospatial data handling flexible.

Prerequisite — To get best out of this blog check out my previous article which cover about basics of Geospatial data with additional snowflake offerings. Click Here

**** In this article we will use OpenStreetMap New York Dataset available on Snowflake’s Marketplace. So, make the dataset available with you ;) ****

Editorial’s Section

  • Geospatial Data Formats (Input & Output)
  • Geospatial Object Types
  • Geospatial Functions with examples

Geospatial Data Formats

Snowflake’s both data type Geometry and Geography supports the following industry standard data formats for input ( loading data into the GEOGRAPHY data type) and output (result set when queried from a GEOGRAPHY-typed column):

Here, we will be showing the three formats as well as a query output representation of the longitude and latitude of the a generic store — Panda Spa.

  • Well-Known Text (WKT) / Extended WKT (EWKT): The GEOGRAPHY / GEOMETRY results are returned in Text format which are good for human readability.
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT to_geography(coordinates) AS Panda_Spa
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP WHERE name = 'Panda Spa';
  • Well-Known Binary (WKB) / Extended WKB (EWKB): The GEOGRAPHY / GEOMETRY results are returned in Binary format which are fast and easiest for data loading.
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKB'; 
SELECT to_geography(coordinates) AS Panda_Spa
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP WHERE name = 'Panda Spa';
  • Geo JSON: The data returned is represented in JSON arrays, which is a popular industry format and easy to Query.
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='GeoJson';
SELECT to_geography(coordinates) AS Panda_Spa
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP WHERE name = 'Panda Spa';

Session parameter — “GEOGRAPHY_OUTPUT_FORMAT” controls how a GEOGRAPHY output will display in Snowflake or in a file after an unload operation.

Geospatial Object Types

The GEOGRAPHY and GEOMETRY data types can store various types of Geospatial objects which can be used for analysing the data further.

In the output representations of the Well-Known Text (WKT) format a particular object type, or geospatial dimension type: a POINT was reflected. There are many different object types, some of which are singular objects, and others which are groupings of multiple objects, as shown in the table below:

Geospatial Objects supported by Snowflake

Geospatial Functions

Geospatial functions helps us to operate on a GEOGRAPHY-typed object, and they make the calculations and analysis over data much more easier. These functions operate on GEOGRAPHY and GEOMETRY and convert Geography and Geometry values to and from other data types(such as VARCHAR).

There are more than 50+ functions introduces by snowflake, to work around the Geospatial data. Let’s take a look at the categories of geospatial functions and a few examples of each, and we’ll follow that with some examples of how to use them in SQL. If you want the full list of functions, click here

Geospatial Functions

These functions can be used different ways:

  • In the SELECT clause
  • In the query predicate
  • In a JOIN clause.

Let’s look at some examples of each usage, with different functions in SELECT operation:

  • TO_GEOGRAPHY: This function accepts a geometric object (WKB/WKT/GeoJson) string expression and returns a value of type Geography, which is a true geospatial data type that can be used with geospatial functions.
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='GeoJson';
SELECT to_geography(coordinates) AS Pharmacy_shop
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';
TO_GEOGRAPHY Function Example
  • TO_GEOMETRY: A conversion function, which accepts a geometric object (WKB/WKT/GeoJson) string expression and returns a value of type Geometry, which is a true geospatial data type that can be used with geospatial functions.
  • ST_ASEWKT: Output conversion function which accepts Geometry or Geography data type and return the text (VARCHAR) representation of that value in EWKT (Extended-Well-Known-Format).
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT to_geometry( ST_ASEWKT(coordinates)) AS Pharmacy_shop
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';
TO_GEOMETRY and ST_ASEWKT Function Example

In the above demo, I have used two functions as coordinates column of the table was of geography type and TO_GEOMETRY function allows only geospatial data objects as its input. To convert the Geography data to EWKT format, that task was done by ST_ASEKWT function.

  • ST_ASTEXT: Output conversion function which accepts Geometry or Geography data type and return the text (VARCHAR) representation of that value in EWKT (Extended-Well-Known-Format).
SELECT ST_ASTEXT(coordinates) AS Pharmacy_shop_as_text
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';
ST_ASTEXT Function Example
  • ST_GEOHASH: Returns geohash of a Geography data.A geohash is a short base32 string that identifies a geodesic rectangle containing a location in the world.
SELECT ST_GEOHASH(coordinates) AS Pharmacy_shop_Geohash
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';
ST_GEOHASH Function Example

ST_GEOHASH returns a geohash that is 20 characters long. The number of characters in a geohash determines precision, more the number better the precision.

For below functions I took the Coordinates of True Care Pharmacy POINT(-73.9794048 40.6846322) and Panda Spa POINT(-73.9235839 40.7677767) using below Query

select Name,Coordinates from OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP 
WHERE name IN ('True Care Pharmacy','Panda Spa') ;
  • ST_MAKELINE: A function which generates a geography or geometry object that represents a line connecting the points in the input parameter. The object must be a Point, MultiPoint, or LineString. In the example we are using Coordinates extracted in the above query.
SELECT st_makeline(to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')) 
as line_between_Spa_and_Pharmacy;
ST_MAKELINE Function Example

ST_POINT: Returns a Geography object that represents a point with the specified longitude and latitude. ST_POINT is an alias for ST_MAKEPOINT. It accepts comma separated Real value of Latitude and Longitude , construct a Geography object

ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT st_point(-73.9794048, 40.6846322);
ST_POINT Function Example
  • ST_STARTPOINT: The function accepts a Geography or geometry type linestring and returns a value of type GEOGRAPHY or GEOMETRY that contains the first Point of the specified LineString.
SELECT st_startpoint(st_makeline(
to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')))
as Start_point_of_line_between_Spa_and_Pharmacy;
ST_STARTPOINT Function Example

ST_XMAX, ST_XMIN, ST_YMAX, and ST_YMIN: Returns maximum/minimum latitude/longitude of all points contained in specified Geometry or Geography object. Return a real data data type value. The function takes into account the curvature of the edges toward the poles.

ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT Name as Shop_NAME,
ST_XMAX(coordinates) AS ST_XMAX,
ST_XMIN(coordinates) AS ST_XMIN,
ST_YMAX(coordinates) AS ST_YMAX,
ST_YMIN(coordinates) AS ST_YMIN
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';
ST_XMAX, ST_XMIN, ST_YMAX, and ST_YMIN Functions Example

ST_LENGTH: Returns the geodesic length of the LineString(s) in a GEOGRAPHY object or the Euclidean length of the LineString(s) in a GEOMETRY object.

It returns a REAL value, which represents the length

  • For GEOGRAPHY input values, the length is in meters.
  • For GEOMETRY input values, the length is computed with the same units used to define the input coordinates.
SELECT st_length(
st_makeline(
to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')))
as Length_of_path_between_Spa_and_Pharmacy;
ST_LENGTH Function Example

If the input is not a LineString, MultiLineString, or GeometryCollection containing linestrings, ST_LENGTH returns 0

ST_UNION: Returns a GEOGRAPHY object that represents the combined set of all points from both objects (i.e. the union of the two objects),given input as GEOGRAPHY objects.

SELECT ST_UNION(
st_makeline(
to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')),to_geography('POINT(40 45)'))
AS UNION_of_Spa_and_Pharmacy;
ST_UNION Function Example

ST_UNION is not guaranteed to produce minimal output.

We have tried covering some of the functions of each type. To get the full list of the functions checkout the snowflake document on this link

Article Code Snippet

This brings us to the end of the article. Find the entire code snippet for this article below:

//************************************** Geospatial Data Formats ************************************** 

// To render data in WKT format
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT to_geography(coordinates) AS Panda_Spa
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'Panda Spa';

// To render data in WKB format
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKB';
SELECT to_geography(coordinates) AS Panda_Spa
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'Panda Spa';

// To render data in GeoJson format
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='GeoJson';
SELECT to_geography(coordinates) AS Panda_Spa
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'Panda Spa';


//************************************** Geospatial Functions Usage **************************************

// Using TO_GEOMETRY and ST_ASEWKT Function
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT to_geometry( ST_ASEWKT(coordinates)) AS Pharmacy_shop
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';

// Using TO_GEOGRAPHY function
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='GeoJson';
SELECT to_geography(coordinates) AS Pharmacy_shop
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP WHERE name = 'True Care Pharmacy';

// Using ST_ASTEXT function
SELECT ST_ASTEXT(coordinates) AS Pharmacy_shop_as_text
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';

// Using ST_GEOHASH function
SELECT ST_GEOHASH(coordinates) AS Pharmacy_shop_Geohash
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';

// Extracting coordinates of two shops
select Name,Coordinates from OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name IN ('True Care Pharmacy','Panda Spa') ;

// Using st_makeline function
SELECT st_makeline(to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')) as line_between_Spa_and_Pharmacy;

// Using ST_POINT function
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT st_point(-73.9794048, 40.6846322);

// Using ST_STARTPOINT function
SELECT st_startpoint(st_makeline(
to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')))
as Start_point_of_line_between_Spa_and_Pharmacy;

// Using ST_XMAX, ST_XMIN, ST_YMAX, and ST_YMIN functions
ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';
SELECT Name as Shop_NAME,
ST_XMAX(coordinates) AS ST_XMAX,
ST_XMIN(coordinates) AS ST_XMIN,
ST_YMAX(coordinates) AS ST_YMAX,
ST_YMIN(coordinates) AS ST_YMIN
FROM OPENSTREETMAP_NEW_YORK.NEW_YORK.V_OSM_NY_SHOP
WHERE name = 'True Care Pharmacy';

// Using ST_LENGTH functions
SELECT st_length(
st_makeline(
to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')))
AS Length_of_path_between_Spa_and_Pharmacy;

// Using ST_UNION functions
SELECT ST_UNION(
st_makeline(
to_geography('POINT(-73.9794048 40.6846322)'), to_geography('POINT(-73.9235839 40.7677767)')),to_geography('POINT(40 45)'))
AS UNION_of_Spa_and_Pharmacy;

Summary

In this article we have covered Geospatial Data Formats (Input & Output) which are entertained by Geospatial functions as an input and output. We talked about various Geospatial Object Types, which can be encountered while working with Geospatial data. We further discussed various Geospatial Functions of different types with detailed example of each function using OpenStreetMap New York Dataset available on Snowflake’s Marketplace.

About Me

I am working as a Cloud Data Engineer at IBM and love to write about snowflake and other cloud offerings.. I have been working on Microsoft Azure Cloud as a part of my job & have knowledge of IBM Cloud, Amazon Web Services . I am Azure Certified Data Engineer, AWS Certified Developer and recently completed Snowflake’s Pro Core Certification .

Follow me on Medium to catch up new articles on various different cloud offerings. Feel free to connect with me on LinkedIn!

--

--

Riya Khandelwal

Data Engineer @ KPMG US | Data & Cloud Enthusiast - Snowflake; Azure | Tech Mantra - "Learn New Things Daily" | Connect - www.linkedin.com/in/riyakhandelwal