Photo by Simon Berger on Unsplash

Data Transformation in Snowflake with DBT

David Oyegoke
Slalom Data & AI
Published in
3 min readOct 5, 2020

--

This is the sixth and final blog in Slalom’s Snowflake series.

In our last blog, we explored how we can automate data pipelines with Snowflake Streams and Tasks.

In this post, we’ll discuss data transformation in Snowflake with DBT (Data Build Tool).

Snowflake is a purpose-built cloud data platform which has brought a wave of change along with ample opportunities to greatly improve and simplify the cloud data platform world.

The case for DBT (Data Build Tool)

Snowflake simplifies data management and provides core constructs to simplify data transformations. However, the data engineering challenges around the building, maintenance and deployment of data pipelines remain.

While there are many ETL/ELT tools out there which offer a suite of products for this, there is one simpler, open-source option; DBT.

The Data Build Tool (DBT) is a command-line tool based on SQL and Jinja (a templating language) which enables users to transform data within the target database.

DBT with Snowflake

When paired with Snowflake, DBT enables rapid development of optimised ELT data transformation pipelines.

Snowflake features like auto scaling, zero-copy cloning, streams, extensive support for semi-structured data,paired with template-based SQL coding in DBT create a perfect combination for building data transformation pipelines.

While DBT is built primarily for data transformation, some unique features of Snowflake mean it can be used to perform data loading as well.

Why DBT?

Pros:

  • Free and well documented open-source software
  • Easy to set up and use SQL-based tool
  • Increased productivity for data engineers and analysts
  • Increased collaboration and re-use
  • Integrated quality assurance and data quality testing
  • Generates documentation automatically
  • Enforces standards for all data pipelines
  • Generates searchable data lineage and catalogue information

Cons:

  • A GUI exists but is mainly text-oriented
  • Focus is mainly on data transformation within the database

Client Use Case

Increasingly, many organisations are turning to DBT to transform their data within Snowflake, thereby leveraging their investment into the platform.

Recently, one of our clients was migrating from SQL Server to Snowflake and initially chose StreamSets as the ETL tool. During the initial MVP, the client faced challenges around product suite configuration and performance. They also realised the learning curve required for the data engineers who were mostly SQL developers.

The decision was made to look for alternatives and DBT was evaluated due to its ease of use and SQL oriented programming. The team were impressed with the range of features that DBT offers out of the box, such as environment parameterisation, testing, documentation and its seamless integration with Snowflake. As a result, the client was able to rapidly develop and test the data transformations using DBT.

How Slalom can help

Slalom has the experience to help your business make the most of Snowflake and set the stage for long-term growth and sustainability.

Our Snowflake credentials include:

  • Snowflake Partner of the Year for three consecutive years — 2018, 2019 and 2020
  • Over 200 Snowflake certified consultants
  • Over 300 Snowflake projects delivered

We hope you enjoyed this blog series. Reach out to us if you would like to learn more.

David Oyegoke is a Data & Analytics Consultant based in Slalom’s London, UK office. Ashish Billore is a Data & Analytics Solution Architect, also based in Slalom’s London, UK office.

Slalom is a modern consulting firm focused on strategy, technology, and business transformation.

--

--