StarSnow: HTTP Client for Snowflake SQL

Tamas Foldi
HCLTech-Starschema Blog
5 min readJan 3, 2021

Generic Snowflake functions to interact with web APIs directly from SQL statements.

Night sky over a snowy forest
Photo by Alessandro Viaro on Unsplash

Snowflake is an extremely SQL-friendly database: you can ingest, transform, and access your structured and semi-structured data directly from your SQL code. However, as a cloud-only data platform, it has some fundamental restrictions: all insecure operations like accessing local files or establishing network connections are unsupported from UDFs and stored procedures. There is no build function for calling external APIs or downloading web data from pure SQL. The only exceptions from this constraint are external functions, that allow calling AWS lambda functions, but nothing else.

This is why I built StarSnow, the first general-purpose HTTP client for Snowflake SQL.

Introducing StarSnow

StarSnow allows you to download external web content and invoke complex REST APIs from a simple interface. The package contains two functions, a simplified one for basic get requests and a regular one that allows more tailored requests to be constructed.

Simple HTTP GET

Let’s take a look at a basic use case. We want to check the Snowflake platform’s status information from our ETL processes. To do that, I need to get data from Snowflake’s Statuspage.

First, let’s try it with the simplified get function, called STARSNOW_REQUEST_GET. The first — and only — parameter is the URL: this is where we get the data, and on successful execution, it will return with the contents of that page as string/varchar.

tfoldi@TEST_DB.PUBLIC> select STARSNOW_REQUEST_GET('https://status.snowflake.com/api/v2/status.json') as snowflake_status;+--------------------------------+
| SNOWFLAKE_STATUS
|--------------------------------|
{"page":{"id":"94s7z8vpy1n8","name":"Snowflake","time_zone":"America/Los_Angeles","updated_at":"2020-12-23T17:44:16.380-08:00","url":"https://status.snowflake.com"},"status":{"description":"All Systems Operational","indicator":"none"}}

1 Row(s) produced. Time Elapsed: 3.349s

This was easy, isn’t it?

But what if I want only the status text itself? This is when I need to switch to the regular function, called STARSNOW_REQUEST. Unlike the previous STARSNOW_REQUEST_GET function, this will return with a variant type, thus we can start accessing its elements directly from the select statement:

tfoldi@TEST_DB.PUBLIC> select STARSNOW_REQUEST('https://status.snowflake.com/api/v2/status.json', NULL):data:status:description as snowflake_status;+---------------------------+                                                | SNOWFLAKE_STATUS          |
|---------------------------|
| "All Systems Operational" |
+---------------------------+
1 Row(s) produced. Time Elapsed: 1.739s

This was pretty neat, too. Since the Statuspage API server responded with a JSON object, our function automatically parsed it and made part of the returned variant object. Then, accessing some part of the response was as easy as adding :data:status:description to the SELECT’s field definition to get only that part.

Handling parameters — overcoming Snowflake’s function parameter limitations

And now a little bit of an intermezzo. Life is not full of simple get requests, so we need a fancy way to pass different options and parameters to our function. Now the hard part is, Snowflake does not support optional parameters or named parameter passing. While we can overload functions, that will end up with dozen of functions with all the different permutations, which will make everything pretty inconvenient. So what’s the best practice in Snowflake? The answer is object parameters.

Let’s imagine that we want to pass named parameters to our functions such as method, parameters, body and so on. In some other databases, we can use named parameters but in Snowflake, objects are our best option. A function created with:

create function function_with_many_options(params object) returns float as  [...]

can be invoked as:

select function_with_many_options(
object_construct( 'param1', 'foo'
, 'param2', 3.14
, 'param10', current_timestamp(2)));

While we have only one parameter for our function, invokers can pass any number of options inside that single parameter as key-value pairs.

Thus, the second parameter of our STARSNOW_REQUEST is an object, that maps perfectly to an Axios request configuration object, allowing us to define dynamically request method, query parameters, request body, proxy server, and so on.

Ingesting Snowflake stock prices directly from a REST API

Let’s look at another use case, where we want to ingest data directly from a REST API to a Snowflake Table. Previously, we had to download the data from the API, stage it in Snowflake, then copy it to the final table. For large data volumes, it makes sense, but it is a fairly big overhead for small data.

For getting the stock prices, I am using Alpha Vantage API. It is completely free, you can get an API key here.

Now we can start building our query. The URL we need to invoke is:

https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=SNOW&apikey=apikey

We can start constructing our options parameter and add the necessary flattening code to transform the returned JSON/variant to column, then persist to a real table.

Executed directly from SnowSight, we can immediately see the results:

Nice, default chart from the freshly ingested data in SnowSight (image by the author)

Wow, we just invoked, transformed and ingested REST API data to Snowflake with a single SQL statement!

Snowflake and Slack Integration

The last example is about HTTP POST requests with a lively use case: sending a Slack message from Snowflake. We use this feature for exception handling. For example, if some of the data pipeline fails, Snowflake can inform us on one of our ChatOps channels.

To start, create a new Webhook Integration, following the steps here.

Then execute the following statement in your Snowflake instance:

select STARSNOW_REQUEST('https://hooks.slack.com/<your_webhook_id>',
OBJECT_CONSTRUCT('method', 'post',
'data', OBJECT_CONSTRUCT(
'text', 'hello from snowflake')));

And this is how your friendly snowflake bot says hi to you on slack:

Instead of just saying ‘hi’, you can also send the result of a select statement.

Deploying StarSnow

For deployment, I used Serverless along with my Serverless Snowflake plugin which automates every step of the installation.

First, you need to install Serverless:

npm install -g serverless

Then download and compile StarSnow sources:

git clone https://github.com/starschema/starsnow_request
npm install

Set up the following environment variables:

  • SNOWFLAKE_ACCOUNT
  • SNOWFLAKE_USERNAME
  • SNOWFLAKE_DATABASE
  • SNOWFLAKE_PASSWORD
  • SNOWFLAKE_ROLE (make sure that the role has CREATE INTEGRATION privilege)
  • SNOWFLAKE_SCHEMA
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

And finally, start deploying the lambda functions and external functions with:

serverless deploy

At this point, you should have everything set up in your environment.

Conclusion

While Snowflake doesn’t have an HTTP Client built-in, with StarSnow you can still interact with REST APIs. Web scraping, ingesting results from external web services or sending Teams and Slack messages are now possible from pure Snowflake SQL. If you need any help to set it up, just reach out to me on LinkedIn or Twitter.

Feel free to use or contribute to StarSnow at its GitHub project page: https://github.com/starschema/starsnow_request

Update: StarSnow is now fully supported in Snowpark, making it easy to use with Python too.

--

--

Tamas Foldi
HCLTech-Starschema Blog

Helping enterprises to become more data driven @ HCLTech, co-founder & former CEO @ Starschema