Optimizing Credit Usage: Effective Ways to Obtain Storage Details for a Specific Snowflake Database

One of my team members recently asked for help with calculating the database storage credits for a project in Snowflake. To calculate the storage space used by a specific Snowflake database, we first need to understand the base metadata views that provide the necessary details.

DATABASE_STORAGE_USAGE_HISTORY (view) -> Details about database and Fail-safe storage

STAGE_STORAGE_USAGE_HISTORY (view) -> Details about internal stages

Before diving into the calculation process or SQL queries, it’s important to know that Snowflake uses a usage-based pricing model, meaning that the credits charged depend on the amount of storage used and the duration of time that the data is stored.

Snowflake’s pricing model is based on the daily average volume of data stored in its system, measured in bytes. This includes compressed or uncompressed files that are staged for bulk loading or unloading, historical data stored in Fail-safe, and data maintained in database tables.

Snowflake automatically compresses all table data to optimize and reduce storage. It calculates the amount of storage used by an account based on the size of the compressed data.
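For a quick look at the compressed footprint Snowflake reports per table, the BYTES column of a database’s INFORMATION_SCHEMA can be queried. This is only a minimal sketch; MY_DB is a placeholder database name, not something from this project.

-- Illustrative only: compressed (stored) size of each table in MY_DB.
-- MY_DB is a placeholder; replace it with your own database name.
SELECT TABLE_SCHEMA
     , TABLE_NAME
     , BYTES                             AS COMPRESSED_BYTES
     , ROUND(BYTES / POWER(1024, 3), 2)  AS COMPRESSED_GB
FROM MY_DB.INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY BYTES DESC;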

DATABASE_STORAGE_USAGE_HISTORY View:

This Account Usage view can be used to query the average daily storage usage, in bytes, for databases in the account for the last 365 days (1 year). The data includes:

  • All data stored in tables in the database(s).
  • All historical data maintained in Fail-safe for the database(s).

Columns and description of the DATABASE_STORAGE_USAGE_HISTORY view:

Figure 1 Database_Storage_Usage_History View
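Since the goal here is storage for a specific database, this view can also be filtered by DATABASE_NAME. The following is a minimal sketch; MY_DB is a placeholder database name and the 30-day window is arbitrary.

-- Illustrative only: recent daily database and Fail-safe storage for one database.
SELECT USAGE_DATE
     , AVERAGE_DATABASE_BYTES
     , AVERAGE_FAILSAFE_BYTES
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY
WHERE DATABASE_NAME = 'MY_DB'
  AND USAGE_DATE >= DATEADD('DAY', -30, CURRENT_DATE())
ORDER BY USAGE_DATE;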

STAGE_STORAGE_USAGE_HISTORY View:

This Account Usage view can be used to query the average daily data storage usage, in bytes, within the last 365 days (1 year) for all the Snowflake internal stages in the account, including:

  • Named internal stages.
  • Default staging areas (for tables and users).

Columns and description of the STAGE_STORAGE_USAGE_HISTORY view:

Figure 2 Stage_Storage_Usage_History View
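This view reports a single account-level figure per day rather than a per-stage breakdown. A minimal sketch for pulling the recent daily values (the 30-day window is arbitrary):

-- Illustrative only: daily internal stage storage for the last 30 days.
SELECT USAGE_DATE
     , AVERAGE_STAGE_BYTES
FROM SNOWFLAKE.ACCOUNT_USAGE.STAGE_STORAGE_USAGE_HISTORY
WHERE USAGE_DATE >= DATEADD('DAY', -30, CURRENT_DATE())
ORDER BY USAGE_DATE;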

Note: Both views return storage usage within the last 365 days (1 year).

SQL Query:

-- Daily database and Fail-safe storage per database for February 2023
SELECT CONVERT_TIMEZONE('UTC', USAGE_DATE) AS USAGE_DATE
     , DATABASE_NAME                       AS OBJECT_NAME
     , 'DATABASE'                          AS OBJECT_TYPE
     , MAX(AVERAGE_DATABASE_BYTES)         AS DATABASE_BYTES
     , MAX(AVERAGE_FAILSAFE_BYTES)         AS FAILSAFE_BYTES
     , 0                                   AS STAGE_BYTES
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY
WHERE USAGE_DATE >= DATE_TRUNC('DAY', ('2023-02-01')::TIMESTAMP_NTZ)
  AND USAGE_DATE <  DATE_TRUNC('DAY', ('2023-03-01')::TIMESTAMP_NTZ)
GROUP BY 1, 2, 3

UNION ALL

-- Daily internal stage storage for the same period (account level)
SELECT CONVERT_TIMEZONE('UTC', USAGE_DATE) AS USAGE_DATE
     , 'STAGES'                            AS OBJECT_NAME
     , 'STAGE'                             AS OBJECT_TYPE
     , 0                                   AS DATABASE_BYTES
     , 0                                   AS FAILSAFE_BYTES
     , MAX(AVERAGE_STAGE_BYTES)            AS STAGE_BYTES
FROM SNOWFLAKE.ACCOUNT_USAGE.STAGE_STORAGE_USAGE_HISTORY
WHERE USAGE_DATE >= DATE_TRUNC('DAY', ('2023-02-01')::TIMESTAMP_NTZ)
  AND USAGE_DATE <  DATE_TRUNC('DAY', ('2023-03-01')::TIMESTAMP_NTZ)
GROUP BY 1, 2, 3;

The sample SQL query above retrieves the daily usage details of database, Fail-safe, and stage storage for a specific period, from 1 February 2023 through 28 February 2023. A sample screenshot of the query results is shown below.

Figure 3 Sample Query Result

To calculate the database storage for credit calculation, we need to find the average size per day for the month. This means that if the storage measured during the month was 10 GB, 20 GB, 30 GB, and 40 GB (weekly snapshots, to simplify the example), you would pay for 25 GB, which is the average of these values.
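The same monthly average can be computed directly from the Account Usage view. This is a minimal sketch; MY_DB is a placeholder database name, and the date range matches the February 2023 example used above.

-- Illustrative only: average daily storage (database + Fail-safe) for MY_DB in February 2023.
SELECT DATABASE_NAME
     , AVG(AVERAGE_DATABASE_BYTES + AVERAGE_FAILSAFE_BYTES)                            AS AVG_DAILY_BYTES
     , ROUND(AVG(AVERAGE_DATABASE_BYTES + AVERAGE_FAILSAFE_BYTES) / POWER(1024, 3), 2) AS AVG_DAILY_GB
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY
WHERE DATABASE_NAME = 'MY_DB'
  AND USAGE_DATE >= '2023-02-01'
  AND USAGE_DATE <  '2023-03-01'
GROUP BY DATABASE_NAME;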

Once we have determined the monthly average storage size in bytes, we can convert it into credits using a formula that will depend on the specifics of your account contract.
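As a purely illustrative conversion, the sketch below takes the 25 GB monthly average from the example above and multiplies it by a made-up rate of 23 USD per TB per month; substitute the rate (or credit conversion) from your own Snowflake contract.

-- Illustrative only: 25 GB average converted to TB, then priced at an assumed
-- placeholder rate of 23 USD per TB per month (not an actual Snowflake price).
SELECT 25 / 1024.0                  AS AVG_TB
     , ROUND((25 / 1024.0) * 23, 4) AS ESTIMATED_MONTHLY_CHARGE_USD;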

In the next blog post, we will discuss how to calculate the database storage size using table functions and Snowsight.

About me:

I am a Data Engineer and Cloud Architect with experience as a Senior Consultant at EY GDS. Throughout my career, I have worked on numerous projects involving legacy data warehouses, big data implementations, cloud platforms, and migrations. If you require assistance with certification, data solutions, or implementations, please feel free to connect with me on LinkedIn.
