Using a Liquibase DevOps pipeline to automate deployments for Snowflake
This article explains how Liquibase can be used to automate database deployments for Snowflake (and BigQuery) using a DevOps pipeline approach.
With the increasing use of cloud-based databases, many companies are now focusing on developing scalable and secure DevOps solutions to manage their database deployments.
I will not indulge in explaining CI/CD and DevOps yet again; I am sure there are plenty of blogs on that topic. What I would like to focus on is how we can develop a DevOps solution for your database.
In this article, we will focus on developing a solution for Snowflake, but the same setup can be used for BigQuery or any other cloud-based database too.
First, let's understand how this setup benefits the developer community and the organization's ambition to be data driven.
- Reduce the overhead on database admin teams of running scripts manually
- Avoid manual edits to scripts caused by differing nomenclature and naming standards
- Reduce the turnaround time for deployments
- Improve security by avoiding admin roles being handed out across different teams
- Align the rate of change between the application and the database
- Track database code changes
Next, let's dive into the different components that will be used to design this solution.
GitHub
GitHub is a code hosting platform for version control and collaboration. You can find more details on how to get started with it in this quick guide. (You can use any Git platform of your choice, such as Azure DevOps or GitLab, too.)
Jenkins / GitHub Actions
These are continuous integration and continuous delivery (CI/CD) platforms that allow you to automate your build, test, and deployment pipeline. They connect your deployment code to your database landscape.
Liquibase
Liquibase is a database schema change management tool. Simply put, it is the deployment engine that handles versioning, rollbacks, and related tasks.
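To give a feel for the engine, these are the core Liquibase CLI commands our pipeline will rely on (connection details come from a liquibase.properties file, covered in Step 3.2 below):

```bash
liquibase status            # list changesets not yet deployed to the target
liquibase update            # deploy all pending changesets
liquibase rollback-count 1  # roll back the most recently applied changeset
```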
Snowflake / BigQuery
These are enterprise data warehouse solutions that run completely on cloud infrastructure.
Setting up tools:
Now that we know which components are used in the design, let's start implementing each step.
In this setup, we will use our local machine as the host for all deployments.
At enterprise scale, some of the setup for these components would require slight tweaks to accommodate scale and security.
Step 1:
Let's start with setting up GitHub. We need a GitHub (or any Git) repository to manage all deployment code and setup files, with core branches that match the enterprise database landscape.
For example, if your database landscape consists of DEV-TEST-QA-PROD systems, then you will need to create these branches in your GitHub repo.
These branches act as your core branches and should be protected so that merges happen only via pull request approvals. A sketch of this setup follows.
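Here is a minimal sketch, assuming a repo named db-deployments and the DEV-TEST-QA-PROD landscape above (both hypothetical names):

```bash
git clone https://github.com/<your-org>/db-deployments.git
cd db-deployments

# Create one core branch per database environment from the default branch
for env in DEV TEST QA PROD; do
  git branch "$env"
  git push -u origin "$env"
done

# Branch protection (merge only via approved pull requests) is then
# enabled per branch under Settings -> Branches in the GitHub UI.
```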
Step 2:
Now, each of these GitHub repo branches corresponds to a Jenkins pipeline. This setup is achieved using a multi-branch pipeline in Jenkins.
The detailed setup for connecting GitHub with a Jenkins multi-branch pipeline can be found in the post below.
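In the meantime, here is a minimal declarative Jenkinsfile that such a multi-branch pipeline could pick up from each core branch. This is only a sketch: the conf/ directory layout and the single liquibase call are assumptions, not the full pipeline.

```groovy
// Minimal sketch of a Jenkinsfile checked into every core branch.
// In a multi-branch pipeline, Jenkins exposes the branch name as
// env.BRANCH_NAME, which we map to the matching target database.
pipeline {
    agent any
    environment {
        // Hypothetical mapping: branch DEV deploys to the DEV database, etc.
        TARGET_ENV = "${env.BRANCH_NAME}"
    }
    stages {
        stage('Checkout') {
            steps {
                checkout scm   // pull the sql files for this branch
            }
        }
        stage('Deploy') {
            steps {
                // Assumes a per-environment properties file is available
                // on the agent (see Step 3.2)
                sh 'liquibase --defaults-file=conf/${TARGET_ENV}/liquibase.properties update'
            }
        }
    }
}
```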
Step 3.1:
How Liquibase fits into the entire design.
Let's start by understanding how Liquibase works. The changeset is the fundamental building block of Liquibase: every deployment or change is essentially a changeset. Together, these changesets form a changelog, which helps track what has been deployed to the database. An example changeset is shown below.
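For illustration, here is a Liquibase formatted SQL file containing one changeset with its rollback; the author, changeset id, and table are made-up examples:

```sql
--liquibase formatted sql

--changeset jdoe:001-create-customer
CREATE TABLE CUSTOMER (
    ID    NUMBER NOT NULL,
    NAME  VARCHAR(100),
    EMAIL VARCHAR(255)
);
--rollback DROP TABLE CUSTOMER;
```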
Step 3.2:
Next, we need to download the Liquibase files and store them in a location that is shared with, and accessible by, the agent that deploys the SQL files.
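Alongside the Liquibase binaries, each target database needs its connection properties. Below is a sketch of a liquibase.properties file for a hypothetical DEV Snowflake database; the account, database, warehouse, role, and user names are placeholders. Note that recent Liquibase versions bundle Snowflake support, while older ones need the liquibase-snowflake extension and the Snowflake JDBC driver on the classpath.

```properties
# Hypothetical connection properties for the DEV Snowflake database
changeLogFile: changelog/changelog-master.xml
url: jdbc:snowflake://<account>.snowflakecomputing.com/?db=DEV_DB&warehouse=DEPLOY_WH&role=DEPLOY_ROLE
username: LIQUIBASE_SVC
# The password is best injected by the pipeline, e.g. via the
# LIQUIBASE_COMMAND_PASSWORD environment variable, not stored here.
driver: net.snowflake.client.jdbc.SnowflakeDriver
```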
Step 4:
All of these pieces are wired into the Jenkins pipeline, which triggers the changes against the target database. To summarize the steps (a sketch of the changelog generation and deployment follows the list):
1. Check out the GitHub repository containing the SQL files to be deployed.
2. The GitHub repo stores all SQL files, both already deployed and yet to be deployed; the pipeline reads these file names.
3. Using these SQL file names, the pipeline creates the changelog file used to deploy the changes to Snowflake.
4. Ensure all Liquibase files are available on the deploying agent.
5. For Snowflake, we need a service account or a separate system admin account to deploy the changes. The pipeline sets the warehouse and role context before deployment and releases the roles after deployment to ensure governance and security.
6. The changelog helps track which SQL files have already been deployed, so only new and changed SQL files are deployed to Snowflake.
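As an illustration of steps 2-6, the pipeline step below generates a master changelog that includes every SQL changelog in the repo and then runs a Liquibase update. The changes/ directory layout and the TARGET_ENV variable are assumptions carried over from the Jenkinsfile sketch above.

```bash
#!/usr/bin/env bash
# Minimal sketch: build changelog-master.xml from the sql files in
# changes/ and deploy them. Assumes the sql files are Liquibase
# formatted SQL changelogs (see Step 3.1) and liquibase is on PATH.
set -euo pipefail
shopt -s nullglob

mkdir -p changelog
CHANGELOG=changelog/changelog-master.xml

# Generate the master changelog, one <include> per sql file.
{
  echo '<?xml version="1.0" encoding="UTF-8"?>'
  echo '<databaseChangeLog'
  echo '    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"'
  echo '    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
  echo '    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog'
  echo '        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd">'
  for f in changes/*.sql; do
    echo "    <include file=\"$f\"/>"
  done
  echo '</databaseChangeLog>'
} > "$CHANGELOG"

# Liquibase records what it has already run in the DATABASECHANGELOG
# table, so re-including old files is safe: only new changesets run.
liquibase --defaults-file="conf/${TARGET_ENV}/liquibase.properties" update
```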
This framework is flexible and robust enough to incorporate enterprise needs while ensuring all governance and security measures are followed.
We can also add governance steps, like creating and closing a CHANGE REQUEST by calling an API endpoint from our DevOps pipeline, as sketched below.
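For example, a pipeline stage might open a change request before the deployment and close it afterwards. The endpoint, payload, and token below are purely hypothetical placeholders for whatever ITSM tool you use:

```bash
# Hypothetical change-management API call made by the pipeline
# before the Liquibase update runs.
curl -sS -X POST "https://itsm.example.com/api/change-requests" \
  -H "Authorization: Bearer ${ITSM_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"summary": "Snowflake deployment", "environment": "'"${TARGET_ENV}"'"}'
```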
For more detail on what Liquibase is and how it works, please stay tuned; I will publish a deep-dive post soon.
#data #dataengineering #datasolutions #devops #solutionarchitecture #snowflake #bigquery #liquibase #github #git