3 Main Elements in Your Snowflake Bill
We use Snowflake as a managed data warehouse at Wise to serve primarily analytical use cases. This analytics data warehouse provides the compute, data storage, transformation and modelling layer to serve an extensive collection of business data which helps us make data-driven decisions and provide meaningful business insights.
Like everything in today’s world, Snowflake comes at a cost, and for us, Wisers, it is essential to get a better understanding of the billing model, so we can identify levers to be more cost-efficient and stay aligned to our company mission.
When I looked closer, I struggled to find a summary of the main cost-contributing elements of Snowflake; the only starting point was Snowflake’s thorough and lengthy documentation. In this blog, I’m going to summarise my findings, hoping others find them helpful.
Snowflake cost is based on three elements: data storage, compute resources and data transfer. This blog will look at each of these elements in more detail.
Snowflake charges a flat rate for a terabyte (TB) of data per month. This rate depends on the type of Snowflake storage account (On-Demand or Capacity) and the cloud provider’s region where Snowflake is hosted. This flat rate also includes the historical data maintained for Time Travel and Fail-Safe, which are features for ensuring the maintenance and availability of historical data.
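To make the flat-rate model concrete, here is a minimal sketch of the storage calculation. The function name and the rate used in the example are hypothetical placeholders, not Snowflake's actual prices; the real rate depends on your account type and region.

```python
def monthly_storage_cost(
    active_tb: float,
    time_travel_tb: float,
    fail_safe_tb: float,
    rate_per_tb: float,
) -> float:
    """Estimate one month of storage cost under a flat per-TB rate.

    Time Travel and Fail-safe data count towards the billed volume,
    so they are added to the active data before applying the rate.
    """
    total_tb = active_tb + time_travel_tb + fail_safe_tb
    if total_tb < 0 or rate_per_tb < 0:
        raise ValueError("storage volume and rate must be non-negative")
    return total_tb * rate_per_tb


# Example with a made-up rate of 20 (currency units) per TB per month.
estimate = monthly_storage_cost(
    active_tb=10, time_travel_tb=1.5, fail_safe_tb=0.5, rate_per_tb=20.0
)
```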
Data Storage is roughly 10% of our total Snowflake bill
Compute resources use a different unit of measure for billing, called Credits. Credits are billed per second with a 1-minute minimum: each time a warehouse is created, started or resized, at least one minute of credits is billed, and usage beyond that first minute is billed per second.
Credits are billed only when resources are used, such as when a virtual warehouse is running, the cloud services layer performs an activity, or serverless features are used.
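The per-second billing with a 1-minute minimum can be sketched as follows. This is an illustration of the rule as described above, with hypothetical function names; it is not Snowflake's own billing code.

```python
MINIMUM_BILLED_SECONDS = 60  # the 1-minute minimum per warehouse start


def billed_seconds(run_seconds: int) -> int:
    """Seconds billed for a single run of a warehouse."""
    if run_seconds < 0:
        raise ValueError("run_seconds must be non-negative")
    return max(MINIMUM_BILLED_SECONDS, run_seconds)


def credits_for_run(run_seconds: int, credits_per_hour: float) -> float:
    """Credits consumed by one run, given the warehouse's hourly credit rate."""
    return billed_seconds(run_seconds) * credits_per_hour / 3600
```

For example, a warehouse that runs for 10 seconds is billed for 60 seconds, while one that runs for 90 seconds is billed for exactly 90 seconds.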
Compute Resources account for ~90% of our total Snowflake bill
A Virtual Warehouse is a compute resource where queries are executed, processing power is used to load data, and Data Manipulation Language (DML) operations are performed.
The number of credits consumed depends on the number of virtual warehouses, their size, how long they run and the compute resources used to load data into them.
Snowflake Virtual Warehouses come in ten sizes from X-Small to 6X-Large, starting from 1 Credit/Hour, doubling Credits/Hour for each increment to the next larger Warehouse.
At Wise, we predominately use a mix of X-Small (1 Credit/Hour) and Small (2 Credits/Hour) warehouses.
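Because the rate doubles with each size increment, the hourly credit rate can be derived from a warehouse's position in the size list. A minimal sketch, with hypothetical names:

```python
# The ten warehouse sizes, ordered from smallest to largest.
WAREHOUSE_SIZES = [
    "X-Small", "Small", "Medium", "Large", "X-Large",
    "2X-Large", "3X-Large", "4X-Large", "5X-Large", "6X-Large",
]


def credits_per_hour(size: str) -> int:
    """Hourly credit rate: 1 for X-Small, doubling with each larger size."""
    if size not in WAREHOUSE_SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    return 2 ** WAREHOUSE_SIZES.index(size)


def credits_for_hours(size: str, running_hours: float) -> float:
    """Credits consumed by a warehouse of the given size over its running time."""
    return credits_per_hour(size) * running_hours
```

This makes the cost of oversizing easy to see: eight running hours cost 8 credits on an X-Small warehouse and 16 credits on a Small one.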
Warehouses are only billed for credit usage while running. By default, our Warehouses go to sleep (Suspended) after a period of inactivity (2 minutes in most Warehouses), which means they do not use any credits when suspended. Also, to reduce credit usage, we have scheduled tasks to downsize some warehouses during known quiet periods (e.g. weekends and nights).
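The scheduled downsizing idea can be sketched as a simple rule that picks a target size from the time of day and day of the week. The hours and sizes below are assumptions made up for illustration; they are not our actual schedule.

```python
from datetime import datetime

# Hypothetical values for illustration only.
QUIET_START_HOUR = 22   # nights assumed to start at 22:00
QUIET_END_HOUR = 6      # and to end at 06:00
NORMAL_SIZE = "Small"
QUIET_SIZE = "X-Small"


def is_quiet_period(moment: datetime) -> bool:
    """True on weekends and during the assumed night hours."""
    is_weekend = moment.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
    is_night = moment.hour >= QUIET_START_HOUR or moment.hour < QUIET_END_HOUR
    return is_weekend or is_night


def target_warehouse_size(moment: datetime) -> str:
    """Size a scheduled task would set the warehouse to at this moment."""
    return QUIET_SIZE if is_quiet_period(moment) else NORMAL_SIZE
```

A scheduled task could evaluate a rule like this and resize the warehouse only when the target size differs from the current one.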
The cloud services layer is a set of services that orchestrate activities across a Snowflake environment. These services run on compute instances provisioned by Snowflake from the cloud provider; some of these cloud services contributing to the use of compute resources are:
- Infrastructure management
- Metadata management
- Query parsing and optimisation
- Access control
As with virtual warehouse usage, cloud services usage is measured in Credits. However, only the portion of cloud services usage that exceeds 10% of the daily virtual warehouse compute consumption is billed.
When cloud services usage stays at or below 10% of the daily compute consumption, no Credits are billed for the cloud services used.
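The 10% rule boils down to a small daily calculation. The sketch below uses a hypothetical function name and takes both inputs as daily credit totals.

```python
def billed_cloud_services_credits(
    warehouse_credits: float, cloud_services_credits: float
) -> float:
    """Cloud services credits billed for one day.

    Only the usage above 10% of that day's virtual warehouse
    consumption is billed; anything at or below the threshold is free.
    """
    free_allowance = warehouse_credits / 10
    return max(0.0, cloud_services_credits - free_allowance)
```

With 100 warehouse credits in a day, 8 cloud services credits are not billed at all, while 15 cloud services credits result in 5 billed credits.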
Serverless features rely on compute resources managed by Snowflake rather than on user-managed virtual warehouses (the serverless compute model). Snowflake automatically maintains, resizes and scales these resources up or down as required for each workload.
Charges for these features are calculated based on the total usage of the resources (including cloud service usage) measured in compute-hours. One compute-hour is comparable to the computing resources utilised when running an X-Small virtual warehouse for an hour.
There are two types of Data Transfer costs: data into Snowflake (data ingress) and data out of Snowflake (data egress).
Data ingress is billed by the cloud provider, not Snowflake. Therefore, we will not cover this cost attribution in this blog, but it needs to be considered in your total cost.
There can be charges from Snowflake and the cloud provider for data egress.
Snowflake currently applies data egress charges only in the following cases:
- Unloading Data from Snowflake: Using the SQL command COPY INTO <location> to unload data to cloud storage in a region different from where our Snowflake account is hosted.
- Database Replication: Replicating data to a Snowflake account in a region different from where our Snowflake account is hosted.
- External Functions: Data sent via API Gateway Private Endpoints incurs PrivateLink charges.
The amount charged per byte depends on the region and the cloud provider where Snowflake is hosted.
On the cloud provider side, data egress charges apply in either of the following cases:
- Data is transferred from one region to another
- Data is transferred out of the cloud provider
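The two cloud-provider cases above can be expressed as a simple check. This is a minimal sketch with a hypothetical `Location` type; the region names in the example are only illustrative values, and actual rates vary by provider and region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    cloud: str   # cloud provider name, e.g. "aws"
    region: str  # provider region, e.g. "eu-west-1"


def egress_charge_applies(source: Location, destination: Location) -> bool:
    """True when data crosses regions or leaves the cloud provider."""
    leaves_provider = source.cloud != destination.cloud
    crosses_region = source.region != destination.region
    return leaves_provider or crosses_region


# Example: moving data within the same provider and region incurs no egress.
snowflake_account = Location(cloud="aws", region="eu-west-1")
same_region_bucket = Location(cloud="aws", region="eu-west-1")
no_egress = not egress_charge_applies(snowflake_account, same_region_bucket)
```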
In our case, data ingress is billed by the cloud provider, and we currently have no requirements for any of the data egress cases mentioned above.
Understanding Snowflake billing can be challenging, depending on your consumption patterns and setup. We always pay close attention to these cost-contributing elements to identify opportunities to get the best blend of cost and performance.
Hopefully, this blog has given you enough easy-to-digest information to better understand Snowflake's billing model.