DBT VSCode Development Environment

Accelerate onboarding and align developer environments with Visual Studio Code Development Containers

Jay Lewis
Feefo Product Engineering
5 min read · Jun 8, 2022


There are many fantastic articles about DBT - if you don’t know what it is then bookmark this page, see https://www.getdbt.com/ and come back later 😄

A quote from the official documentation that I show in presentations is:

dbt™ is a transformation workflow that lets teams quickly and collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation. Now anyone who knows SQL can build production-grade data pipelines.

This is great for getting buy-in from the company board, but hides a classic truth — with great power comes great responsibility.


At Feefo we’re just starting our journey into Analytics Engineering concepts and how the Data Engineering team can efficiently support the rest of the business through self-serve. One of the problems of creating a self-serve data analytics platform with DBT is that those who are self-serving are generally less technical than those creating the platform. This means more people are exposed to problems that engineers are old hands at fixing, but without the experience to fix them, such as:

  • Python versions are… always just a mess
  • Mismatched package and library versions
  • Standardising code styles across teams
  • Working across different OS’s (our Analysts are PBI pros and work on Windows machines, Engineers work on Macs)
  • Running code on different environments / testing
https://xkcd.com/1987/

Ultimately resulting in failed deployments, weird edge cases, and ‘but it works on my machine’ being shouted across the office.

Our job as Data Engineers within Feefo (/DataOps is the cool word now, isn’t it?) is to make the platform as easy to use as possible and solve these problems. Enter — Visual Studio Code Development Containers!

VSCode Development Containers are amazing, and I use them for everything from local python tinkering to locally running/reviewing PRs without even needing to switch my local branch. I think of them as “venv but better”. They are further enhanced by GitHub Codespaces if your current work machine is a potato 🥔 or you’re given a Windows machine to develop on.

Excited? Let’s get going.

Prerequisites

  • Visual Studio Code
  • Remote - Containers Extension (bundled here)
  • Docker Desktop

Up to date list here
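
If you want a quick sanity check before going further, something like the rough sketch below reports which of the command-line tools are missing. It assumes the executables are named docker and code, and that the VSCode code shell command has been added to your PATH, which is not always the case on a fresh install. The function name is made up for this example.

```python
import shutil

# Assumed mapping of prerequisite -> executable name (not from the article).
# "code" is only on PATH if the VS Code shell command has been installed.
PREREQUISITES = {"Docker": "docker", "Visual Studio Code": "code"}


def missing_prerequisites(tools=PREREQUISITES):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All prerequisites found")
```

This only checks that the binaries exist; it does not check that Docker is running or that the Remote - Containers extension is installed.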

Setup

Create a Dockerfile, which acts as the base of our development container. Fishtown have an official image that contains a matching Python install, so just grab that and install anything extra via a requirements.txt as normal.

[Dockerfile]
FROM fishtownanalytics/dbt:1.0.0

# Install pinned python versions to image
COPY requirements.txt /tmp/pip-tmp/
RUN pip3 --disable-pip-version-check --no-cache-dir install -r /tmp/pip-tmp/requirements.txt \
    && rm -rf /tmp/pip-tmp

# Copy dbt code in (jaffle-shop example dir)
WORKDIR /dbt
COPY jaffle-shop/ ./

# Run DBT deps to collect pinned versions of DBT packages
RUN dbt deps --profiles-dir .
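
The image is only reproducible if requirements.txt really does pin exact versions. As a minimal sketch (not part of our actual setup), the helper below flags any requirement line that is not of the form name==version. The package names in the example are hypothetical, and the check is deliberately simple: lines with environment markers or pip options will also be flagged.

```python
import re

# Matches "name==version", optionally with extras such as name[extra]==1.2.3
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements.txt contents, for illustration only
example = """
some-linter==1.2.3
another-tool>=2.0
"""
print(unpinned_requirements(example))
```

Running this in CI or a pre-commit hook is one way to stop an unpinned dependency sneaking into the image.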

If you use dbt packages, list them in jaffle-shop/packages.yml; the dbt deps line above installs them into the image.

[jaffle-shop/packages.yml]
packages:
  - package: calogica/dbt_expectations
    version: 0.5.5
  - package: dbt-labs/audit_helper
    version: 0.5.0
  - package: dbt-labs/dbt_utils
    version: 0.8.4

A development environment is controlled by a devcontainer.json file, kept in a .devcontainer folder at the root of your project:

[.devcontainer/devcontainer.json]
// For format details, see https://aka.ms/devcontainer.json
{
    "name": "DBT",
    "dockerFile": "../Dockerfile", // Relative path
    // Set *default* container specific settings.json values on container create.
    "settings": {}, // Much more to come in here for article two
    // Comment out to connect as root instead. More info: https://aka.ms/vscode-remote/containers/non-root.
    "remoteUser": "vscode"
}
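
Because devcontainer.json is JSON with comments, Python's json module will not read it directly. The sketch below is one rough way to sanity-check the file outside VSCode: it strips // comments that sit outside string literals and then parses the rest, so a stray comment style or a missing comma shows up as a parse error. The function names are made up for this example, and it does not handle /* */ block comments or trailing commas.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line, leaving the newline in place
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def load_devcontainer(text):
    """Parse devcontainer.json content that may contain // comments."""
    return json.loads(strip_line_comments(text))


sample = """
// For format details, see https://aka.ms/devcontainer.json
{
    "name": "DBT",
    "dockerFile": "../Dockerfile", // Relative path
    "remoteUser": "vscode"
}
"""
config = load_devcontainer(sample)
print(config["name"], config["dockerFile"])
```

URLs inside string values are left alone because the comment stripping only happens outside quotes.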

So our final simplified repo looks like:

.
├── .devcontainer
│   └── devcontainer.json
├── Dockerfile
├── requirements.txt
└── jaffle-shop
    ├── models
    ├── packages.yml
    └── ...etc

Open Container

Once in place, you get a handy pop-up from VSCode offering to reopen the folder in a container.

Or press CMD/Ctrl + Shift + P to open the Command Palette (or F1) and select Remote-Containers: Rebuild and Reopen in Container

On the first run it will take a few minutes to build the image; once the window refreshes, you’re in. On subsequent runs it’s almost instant.

Once in the container just develop as normal, with a full VSCode experience.

This ticks the first two problems off the list - Python Environments and DBT Package versions are controlled, standardised across all developers, and can’t go out of sync 🍾

Extras

To actually run against GCP, it’s handy to mount a user’s local dbt profile and their gcloud application-default credentials into the container, so they can easily authenticate with OAuth.

[~/.dbt/profiles.yml]
jaffle-shop:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: gcp-project-name
      dataset: dbt_jay # You can also use "schema" here
      threads: 1

Mount dbt profile and credentials into container home

[.devcontainer/docker-compose.yml]
services:
  app: # Matches "service" in devcontainer.json
    # Assumption: build from the Dockerfile at the repo root
    build:
      context: ..
      dockerfile: Dockerfile
    volumes:
      - ~/.config/gcloud/application_default_credentials.json:/tmp/keys/application_default_credentials.json:ro
      # Mount all code into container
      - ../:/workspace:cached
      # Note the 'home' location should match the devcontainer 'remoteUser'
      - ~/.dbt/profiles.yml:/home/vscode/.dbt/profiles.yml
    # Set environment variable within container
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /tmp/keys/application_default_credentials.json
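
One gotcha with file mounts like these: if the host file does not exist yet, Docker will typically create an empty directory in its place, and dbt then fails with a confusing error inside the container. The sketch below is a small pre-flight check, assuming the macOS/Linux default locations used in the compose file (the credentials file is the one written by gcloud auth application-default login). The function name is made up for this example.

```python
from pathlib import Path

# Host files mounted by the compose file (macOS/Linux default locations assumed)
MOUNT_SOURCES = [
    "~/.config/gcloud/application_default_credentials.json",
    "~/.dbt/profiles.yml",
]


def missing_mount_sources(paths=MOUNT_SOURCES):
    """Return the mount sources that are not existing regular files on the host."""
    return [p for p in paths if not Path(p).expanduser().is_file()]


if __name__ == "__main__":
    for path in missing_mount_sources():
        print(f"Missing on host: {path}")
```

A path that exists but is a directory is also reported, which catches the case where Docker has already created one on a previous run.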

Then just edit the devcontainer.json to use the compose file instead

[.devcontainer/devcontainer.json]
{
    "name": "DBT",
    "dockerComposeFile": "docker-compose.yml",
    "service": "app",
    "workspaceFolder": "/workspace",
    "settings": {},
    ...
}

This allows a user to simply run dbt run like normal, since dbt looks for profiles.yml in ~/.dbt/ by default, as shown by dbt debug

vscode@0630f03cdf5d:/workspace/jaffle-shop$ dbt debug
...
Using profiles.yml file at /home/vscode/.dbt/profiles.yml
...

Their personal OAuth credentials and permissions will be used instead of any elevated service account required for production runs.

Onboarding

When a new developer joins the team, it’s incredibly quick to get them up and running. As long as they have Docker and VSCode installed, everything else comes bundled in the image, pinned at the right versions. Ready to contribute on day 1, not day 15 🚀

Follow me to catch article two in this series — automatically enforcing DBT code styles using standardised settings and extensions within the container.

