Transform Snowflake Data Marketplace data using dbt

dbt enables data analysts and engineers to transform data by simply writing SQL select statements. It comes with a software engineering-influenced framework that enables the ability to easily test, document, and version control your analytics project. dbt compiles code into raw SQL queries that run against your Snowflake cluster to use existing computing resources. It also understands dependencies between your queries and can run them based on dependencies. dbt code is a combination of SQL and Jinja (a pythonic templating language), enabling you to write if statements, loops, and macros in your SQL queries.

Snowflake Data Marketplace allows customers to discover and securely access open and commercial data sets across 16+ categories including SaaS providers to improve business analytics. Data that is accessed via the Data Marketplace is available in the customer’s own Snowflake account, without having to copy and move, thus speeding up time to value.

Currency conversion is one of the most common topics from dbt community discussion, in this post, we will show how to subscribe to live Snowflake Data Marketplace banking and foreign exchange datasets securely, perform the currency conversion using dbt.

Solution overview

To implement this solution, you complete the following high-level steps:

  1. Subscribe to Snowflake Data Marketplace datasets.
  2. Configure your Snowflake Data Cloud environment.
  3. Set up your dbt Cloud prerequisite.
  4. Set up your dbt Cloud development environment.
  5. Build the currency conversion project.
  6. Validation.

The following diagram illustrates the solution process flow.

The solution will be leveraging the following Snowflake Data Marketplace datasets:

Knoema banking data set in this solution is the Balance Sheet of Monetary Financial Institutions excluding the Central Bank of United Kingdom, what we are going to do is to convert the monthly balance sheet in Pound sterling to United States Dollar (USD) using the EDI foreign exchange rates month-end conversion rate.

Prerequisites

Before starting this walkthrough, you must have the following:

Subscribe to Snowflake Data Marketplace datasets

To subscribe to the Snowflake Data Marketplace datasets, complete the following steps:

  1. Login to your Snowflake account.
  2. Choose Data Marketplace.
  3. Choose Explore the Snowflake Data Marketplace.
  4. Enter Knoema Banking Data Atlas at the search bar.
  5. Choose the first result, ie Banking Data Atlas.
  6. Choose Get Data from the top right.

7. Leave default value for Database name.

8. For roles, choose SYSADMIN.

9. Choose Create Database.

10. Choose Marketplace.

11. Enter EDI Foreign Exchange Rates at the search bar.

12. Choose the first result, ie EDI Foreign Exchange Rates.

13. Choose Get Data.

14. Leave default value for Database name.

15. For roles, choose SYSADMIN.

16. Choose Create Database.

Configuring your Snowflake environment

To configure the Snowflake environment, complete the following steps:

  1. Login to your Snowflake account.
  2. Execute the following code to create database dbt and warehouse dbt_wh.

use role sysadmin;

create database dbt;

create or replace warehouse dbt_wh

WAREHOUSE_SIZE = XSMALL

INITIALLY_SUSPENDED = TRUE

AUTO_SUSPEND = 60;

3. Execute the following code to create role dbt_admin and grant relevant privileges.

use role accountadmin;

create role if not exists dbt_admin comment = “Database administration”;

grant usage, operate on warehouse dbt_wh TO role dbt_admin;

grant imported privileges on database FAA22172_FX_SAMPLE to role dbt_admin;

grant imported privileges on database KNOEMA_BANKING_DATA_ATLAS to role dbt_admin;

grant usage on database dbt to role dbt_admin;

grant create schema on database dbt to role dbt_admin;

grant role dbt_admin to role sysadmin;

4. Execute the following code to create user dbt and grant user to the dbt_admin role. Please replace the password below with a secure password, note that you need this password at a later step when configuring dbt cloud project connection.

set pwd = ‘<password>’;

create user if not exists dbt password=$pwd comment=’For DBT blog’;

alter user dbt set default_role=dbt_admin, default_warehouse = dbt_wh;

grant role dbt_admin to user dbt;

5. Execute the following code to validate role, user and warehouse for dbt have been created.

show roles;

show users;

show warehouses;

Setup dbt Cloud prerequisite

To use dbt, you will need a git repository and a data warehouse connection. Complete the follow steps to setup dbt Cloud prerequisite:

  1. Set up an empty repository without any files/folders in it (including a README) on your desired git provider. In this post, we will be using GitHub.
  2. You should have already setup Snowflake with users. We will be using username/password for credentials.
  3. Create a dbt Cloud account. You can sign up for one here.

Setup dbt Cloud development environment

Complete the follow steps to setup dbt Cloud development environment:

  1. Set up your connections by going through the project configuration pathway.
  2. Connect your Snowflake account.
  3. Connect your git repository. You can either connect via the Github app, via git URL, or use a managed repository. In this post, we are connecting via a Github app.
  4. Set up your development environment. You will be creating a development environment to be shared by all of your developers as well as setting up your personal development credentials to access that environment. We recommend selecting the latest version of dbt.
  5. Access the IDE (Integrated Development Environment).
  6. Click on the hamburger menu and click on Develop. This will spin up the IDE which is where you will develop your dbt project.
  7. Create your initial dbt project. Since your repository is empty, dbt will provide you with a green button that says, ‘Initialize a project’. Once you click on it, dbt will generate our starter project with the core files and folders.
  8. Now commit this work to your main branch. When you commit, dbt will then change the main branch to read-only so that you don’t accidentally commit to your main branch as you work.
  9. Check out a new git branch to start developing named ‘feature/currency_conversion_demo’.

Building currency conversion project

To configure currency conversion project, complete the following steps:

  1. Create a sources yml file under models directory with the following content.

version: 2

sources:

- name: BANKING

database: KNOEMA_BANKING_DATA_ATLAS

tables:

- name: EGMFIBS2017

- name: PUBLIC

database: MF79778_FX_DATA_SAMPLE_SHARE

tables:

- name: SAMPLE_FX_VIEW

2. Sources defined in the yml file allow you to query the source tables in your model using the {{ source()}} function and help to define the lineage of your data.

3. Create a sql file called balancesheetusd.sql under models directory with the following content.

with

foreign_exchange_rates as (

select * from {{ source(‘PUBLIC’, ‘SAMPLE_FX_VIEW’) }}

),

banking_data as (

select * from {{ source(‘BANKING’, ‘EGMFIBS2017’) }}

),

balance_sheet as (

select

“Indicator Name”,

banking_data.”Date”,

banking_data.”Value” * foreign_exchange_rates.”Close” as “ValueInUSD”

from banking_data

join foreign_exchange_rates

on banking_data.”Date” = foreign_exchange_rates.”Date”

where “Symbol” = ‘GBPUSD’

)

select * from balance_sheet

4. Three CTEs are used here following dbt coding conventions:

5. foreign_exchange_rates — select from foreign exchange rates data set using source function.

6. banking_data — select from banking data set table EGMFIBS2017 with balance sheet info in Pound sterling using source function.

7. balance_sheet — join foreign_exchange_rates and banking_data CTE to convert Pound sterling to USD.

8. The last select statement will materialize the model in a view when you run ‘dbt run’ to execute the SQL statement.

9. Choose preview data and you should be able to see the top 500 rows to validate the results. This does not create the object in Snowflake.

10. Choose Runs.

11. Enter dbt run command in the text bar and click `>`. This will compile and run the code in Snowflake.

12. Enter dbt docs generate in the text bar and choose >. This will generate the documentation for the project.

13. Choose view docs from the top left corner and a new browser tab should pop up.

14. This is the auto-generated project documentation site, hosted by dbt Cloud.

Validating the transformed data is loaded to Snowflake

The next step is to validate that the view balancesheetusd has been created with transformed currency data has been loaded successfully into Snowflake. Execute the following code on Snowflake, the result should show ValueInUSD.

use database dbt;

use role dbt_admin;

use schema <yourdbtschema>;

select * from balancesheetusd;

Summary

This post showed you step by step procedure on how to set up your first dbt project based on currency conversion use case using Snowflake Data Marketplace data sets. Next post will show you dbt and Snowflake best practices and how we implement the best practices on this project.

If you have any questions or suggestions, please leave a comment.

About the Authors

BP Yau is a Senior Partner Sales Engineer at Snowflake. His role is to help technology partners build cutting-edge integrations and solutions for joint customers. Before Snowflake, he was a Data Warehouse Specialist Solutions Architect at AWS to help customers architect big data solutions to process data at scale. He also helped Amazon.com Supply Chain Optimization Technologies migrate the Oracle Data Warehouse to Amazon Redshift and built the next generation big data analytics platform using AWS technologies.

Amy Chen is a Partner Solutions Architect at Fishtown Analytics (makers of dbt). Her role is to help technology partners and consulting partners further the field of analytics engineering with the use of dbt. She has spent the last few years consulting on matters of data infrastructure and analytics modeling at venture-funded and Fortune 500 companies.

--

--