Snowflake operational reporting for optimal consumption

Aakaash Ramnath Sankar
Cervello, a Kearney Company
Jan 21, 2021 · 6 min read

In the era of cloud computing, we tend to rely on promising, scalable warehouses to process, persist, and analyze data. Pay-as-you-go models from cloud vendors have allowed companies to dip their toes into modern data architecture without making major investments. But as the volume of data produced and consumed continues to grow, so does the demand for computing speed. Keeping track of storage utilization, operating costs, and infrastructure budgets is especially challenging when consumption is skyrocketing. If we manage to do so, we can analyze usage, derive patterns, and make predictions to optimize costs without compromising computing speed or quality.

Staying true to our mantra of “Win With Data,” Cervello’s operational reporting solution leverages Snowflake’s operational usage data to provide insights about consumption patterns and provides a robust mechanism to optimize our Snowflake costs. In a series of three blogs, I’ll walk you through a high-level overview of our solution, share a technical overview of automating data pipelines across multiple accounts, and discuss business intelligence (BI) reports and advanced analytics solutions that enable us to plan for and optimize our Snowflake consumption.

With Snowflake operational reporting, we can monitor the usage of multiple accounts from one centralized Snowflake account (typically an organization account). Consider a scenario with a large client base and many workstreams: we end up creating a separate Snowflake account for every use case to keep billing and budgets distinct. Our solution enables you to view, analyze, and report usage, all from one place.

Snowflake tracks credit usage, query history, and computation costs for the data operations performed under an account. The information in the SNOWFLAKE database is granular and detailed, but it is specific to an account and is not directly accessible outside that account. Our solution uses this readily available information together with Snowflake’s share feature to centralize the usage data. Once the usage data from the associate accounts is streamlined into a master account, the combined dataset is persisted into a database, with views built on top of it to enable reporting and insights. Persisting the historical data also allows us to do more accurate predictive modeling in the future. Our solution addresses a key question of modern business expectations: Can I visualize account usage in near real-time? Yes: the views are consumable by BI tools, which can build reports, charts, dashboards, and alerts from them.
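As a rough sketch of this sharing pattern (all database, share, and account names here are illustrative assumptions, not the solution’s actual code): the SNOWFLAKE.ACCOUNT_USAGE views cannot themselves be added to a share, so each associate account first snapshots them into a local table, which is then shared with the master account:

```sql
-- In each associate account: snapshot usage data into a local, shareable table.
-- OPS_DB, USAGE_SNAPSHOT, OPS_SHARE, and MYORG.* names are illustrative.
CREATE DATABASE IF NOT EXISTS OPS_DB;
CREATE TABLE IF NOT EXISTS OPS_DB.PUBLIC.USAGE_SNAPSHOT AS
  SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY;

CREATE SHARE IF NOT EXISTS OPS_SHARE;
GRANT USAGE ON DATABASE OPS_DB TO SHARE OPS_SHARE;
GRANT USAGE ON SCHEMA OPS_DB.PUBLIC TO SHARE OPS_SHARE;
GRANT SELECT ON TABLE OPS_DB.PUBLIC.USAGE_SNAPSHOT TO SHARE OPS_SHARE;
ALTER SHARE OPS_SHARE ADD ACCOUNTS = MYORG.MASTER_ACCT;

-- In the master account: mount the share as a read-only database.
CREATE DATABASE ASSOC_A_USAGE FROM SHARE MYORG.ASSOC_ACCT_A.OPS_SHARE;
```

The snapshot step also gives the solution its history: ACCOUNT_USAGE views retain data for a limited period, while a local table can keep it indefinitely.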

We faced similar challenges as our own Snowflake usage grew within our Solutions Factory, especially as we built our Snowflake architecture into a multi-region, multi-cloud solution using Snowflake’s organization feature; operational reporting is how we solved them. It was difficult to keep track of costs and usage while the platform kept growing. Now we have operations data from multiple accounts in a centralized database, and from there, analytical dashboards and reports are developed for the business to track and optimize costs.

Pic-100: High-level architecture diagram

Associate accounts are the Snowflake accounts dedicated to particular use cases under a client platform. In most cases, the master account is an organization account, but it can be any other account as well.

1. Prepare the account usage data in associate accounts to share with the master account.

2. Share the account usage data from all the associate accounts to the master account.

3. Once the data is shared, we run scripts to consolidate the data from the associate accounts into a centralized database.

4. BI-specific views are built in the centralized database and placed under a separate schema that a reporting tool can access.
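Steps 3 and 4 might look like the following sketch in the master account (database, table, view, and account names are illustrative assumptions; the actual scripts differ):

```sql
-- In the master account: consolidate the shared usage data (step 3) and
-- expose a BI-specific view in a separate schema (step 4).
CREATE DATABASE IF NOT EXISTS OPS_CENTRAL;
CREATE SCHEMA IF NOT EXISTS OPS_CENTRAL.REPORTING;

-- Union the mounted shares, tagging each row with its source account.
CREATE OR REPLACE TABLE OPS_CENTRAL.PUBLIC.WAREHOUSE_CREDITS AS
  SELECT 'PROJ_A' AS account_name, * FROM ASSOC_A_USAGE.PUBLIC.USAGE_SNAPSHOT
  UNION ALL
  SELECT 'PROJ_B', * FROM ASSOC_B_USAGE.PUBLIC.USAGE_SNAPSHOT;

-- A reporting view the BI tool's role is granted access to.
CREATE OR REPLACE VIEW OPS_CENTRAL.REPORTING.V_CREDITS_BY_WAREHOUSE AS
  SELECT account_name,
         warehouse_name,
         DATE_TRUNC('day', start_time) AS usage_date,
         SUM(credits_used)             AS credits
  FROM OPS_CENTRAL.PUBLIC.WAREHOUSE_CREDITS
  GROUP BY 1, 2, 3;
```

Keeping the BI views in their own schema means the reporting tool’s role only ever needs read access to that schema, not to the raw consolidated tables.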

Key features of operational reporting

Pic-101: Key features

· Automation: The data pipeline is automated with Snowflake’s task feature, which is used to schedule and orchestrate the data refresh across accounts.

· Near real-time reporting: A reporting layer can be built on top of the centralized data to visualize, analyze, and take action based on usage.

· Separation of cost: Queries related to the solution’s deployment and data refresh are not charged to existing warehouses or roles. The solution creates its own warehouse and role, plus an extra role for the reporting tool to access the data, so the cost of operational reporting itself can easily be separated.

· Customizable: Final BI-specific views can be customized, and dimension tables can be created to enrich the usage data for reporting purposes.
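The automation and cost-separation features above can be sketched together: a dedicated warehouse plus a scheduled task that refreshes the usage snapshot incrementally. All names here are illustrative assumptions, not the solution’s actual code:

```sql
-- A dedicated warehouse so refresh queries never bill existing warehouses.
CREATE WAREHOUSE IF NOT EXISTS OPS_REPORTING_WH
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND   = 60
  AUTO_RESUME    = TRUE;

-- A task that appends only rows newer than what was already captured.
CREATE OR REPLACE TASK REFRESH_USAGE_SNAPSHOT
  WAREHOUSE = OPS_REPORTING_WH
  SCHEDULE  = 'USING CRON 0 * * * * UTC'   -- hourly refresh
AS
  INSERT INTO OPS_DB.PUBLIC.USAGE_SNAPSHOT
  SELECT *
  FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
  WHERE start_time > (SELECT COALESCE(MAX(start_time), '1970-01-01'::TIMESTAMP_LTZ)
                      FROM OPS_DB.PUBLIC.USAGE_SNAPSHOT);

ALTER TASK REFRESH_USAGE_SNAPSHOT RESUME;  -- tasks are created suspended
```

Because the task runs on its own warehouse under its own role, every credit it consumes shows up separately in the metering history, which is exactly what makes the cost of operational reporting easy to isolate.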

The reporting layer can also be used for predictive analysis or for setting up alert mechanisms based on account usage data as needed. Below are some sample reports built from the operational reporting solution that highlight insights from the data.

Pic-102: Warehouse credits by project

Every project has a dedicated Snowflake account (Proj A, Proj B, Proj C), each with its own warehouses. The visual above shows how credit usage is distributed across the Snowflake accounts and the warehouses under them.
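A visual like this could be backed by a simple aggregation over the centralized data. The sketch below assumes a hypothetical reporting view, V_CREDITS_BY_WAREHOUSE, with one row per account, warehouse, and day (the actual view names and columns may differ):

```sql
-- Total credits per project (account) and warehouse, ready for a BI chart.
SELECT account_name AS project,
       warehouse_name,
       SUM(credits)  AS total_credits
FROM OPS_CENTRAL.REPORTING.V_CREDITS_BY_WAREHOUSE
GROUP BY project, warehouse_name
ORDER BY total_credits DESC;
```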

Pic-103: Credits usage trend & forecast

The above visual shows monthly credit usage and the forecast for the next six months based on historical usage.
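A basic version of such a trend can even be computed in SQL with Snowflake’s linear-regression aggregates, though the forecasting could equally live in the BI tool or a separate model. This sketch assumes the same hypothetical daily reporting view as above (names are illustrative):

```sql
-- Monthly credit totals with a fitted linear trend line.
WITH monthly AS (
  SELECT DATE_TRUNC('month', usage_date) AS month,
         SUM(credits)                    AS credits
  FROM OPS_CENTRAL.REPORTING.V_CREDITS_BY_WAREHOUSE
  GROUP BY month
),
fit AS (
  -- Regress credits against a month index to get slope and intercept.
  SELECT REGR_SLOPE(credits, DATEDIFF('month', '2020-01-01', month))     AS slope,
         REGR_INTERCEPT(credits, DATEDIFF('month', '2020-01-01', month)) AS icpt
  FROM monthly
)
SELECT m.month,
       m.credits,
       f.icpt + f.slope * DATEDIFF('month', '2020-01-01', m.month) AS trend
FROM monthly m
CROSS JOIN fit f
ORDER BY m.month;
```

Extending the month index past the last observed month projects the same line forward, which is the simplest possible forecast; richer models can be layered on once the history is persisted.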

I’ve shown how Snowflake’s account usage feature can be used to build a consolidated report across multiple Snowflake accounts. For engagements of any size dealing with one or more Snowflake accounts, this adds value by giving clients an overall picture of their bill distribution and helps build a transparent relationship with them.

In my next two blogs, I’ll share a technical overview of automating the data pipelines across accounts, and the BI reports and advanced analytics solutions that enable us to plan for and optimize our Snowflake consumption. Until then, feel free to reach out with any questions and share your thoughts on our Snowflake operational reporting solution.

About Cervello, a Kearney company

Cervello is a data and analytics consulting firm and part of Kearney, a leading global management consulting firm. We help our leading clients win by offering unique expertise in data and analytics, and in the challenges associated with connecting data. We focus on performance management, customer and supplier relationships, and data monetization and products, serving functions from sales to finance. We are a Solution Partner of Snowflake due to its unique architecture. Find out more at Cervello.com.

About Snowflake

Snowflake delivers the Data Cloud — a global network where thousands of organizations mobilize data with near-unlimited scale, concurrency, and performance. Inside the Data Cloud, organizations unite their siloed data, easily discover and securely share governed data, and execute diverse analytic workloads. Wherever data or users live, Snowflake delivers a single and seamless experience across multiple public clouds. Join Snowflake customers, partners, and data providers already taking their businesses to new frontiers in the Data Cloud. snowflake.com.
