How to define deploy order in a monorepo using GitLab CI/CD

Zachary McDonnell
4 min read · Jan 22, 2024

--

GitLab CI/CD has a lot of functionality. However, with a monorepo, the matrix of dependencies and build order can test the limits of what's possible with CI/CD pipelines. At Hona, we've been big fans of our TypeScript monorepo. The shared code and types, from the database all the way to the frontend, have played a large part in our ability to move fast without sacrificing quality. That said, the build and deploy sequence for any change can become complicated to determine and difficult to implement with the feature set GitLab CI/CD provides.


Defining Deploy Sequence in GitLab CI

Take, for example, the following deploy stage from a .gitlab-ci.yml file.

# .gitlab-ci.yml
# no deploy order

db-deploy:
  stage: deploy
  script:
    - echo "Deploy Database Change"

api-deploy:
  stage: deploy
  script:
    - echo "Deploy API Change"

Here we have two deploy jobs: one for the database, and one for our backend API. In the current setup, each one runs at some point during the deploy stage, in no particular order. That might be fine most of the time, but what if the database change must be deployed before the API change? We could write it like this:

# .gitlab-ci.yml
# deploy order with the `needs` keyword

db-deploy:
  stage: deploy
  script:
    - echo "Deploy Database Change"

api-deploy:
  stage: deploy
  needs:
    - 'db-deploy'
  script:
    - echo "Deploy API Change"

Now we've defined the deploy order, but we've introduced another issue: we must always run the db-deploy job, even if there is no database change. In this simplified example that might be acceptable, but what happens when we have many different deployments? Instead of running just the one deployment you care about, you must run every deploy that comes before it, which increases wait times when you are trying to deploy to prod. So here's the next idea: separate stages for each deploy.

# .gitlab-ci.yml
# deploy order with stages

stages:
  ...
  - deploy-db
  - deploy-api

db-deploy:
  stage: deploy-db
  script:
    - echo "Deploy Database Change"

api-deploy:
  stage: deploy-api
  script:
    - echo "Deploy API Change"

Does it work? Yes. Now api-deploy will run after db-deploy when db-deploy is present, and it will still run on its own when db-deploy is absent. But we run into another issue: for every new service with a deployment, we must add a new stage (or at least group each service into the deploy order).

Honestly, this could be a good solution that takes you far, but something about adding a new stage for every deploy didn't sit well with me. It feels like we are forcing a feature to do something it wasn't designed for: sequencing deploys. Wouldn't it be nice to have an optional needs entry that defines the job order only when the upstream job exists? Unfortunately, no, you cannot with GitLab CI.

Use Downstream Pipelines to Define Deploy Order

Downstream pipelines are the next step in managing deployment complexity. Instead of being limited to the current feature set of GitLab CI/CD YAML syntax, you can run a script that generates whatever pipeline YAML file you want. With this script, you can define those conditional dependencies. Here's what it looks like:

# .gitlab-ci.yml
# deploy order with a dynamic pipeline

stages:
  - pipeline-generate
  - pipeline-trigger

generate-pipeline:
  stage: pipeline-generate
  script:
    - ts-node generate-pipeline.ts
  artifacts:
    paths:
      - pipeline.yml

trigger-pipeline:
  stage: pipeline-trigger
  needs: ['generate-pipeline']
  trigger:
    include:
      - artifact: pipeline.yml
        job: generate-pipeline
    strategy: depend

OK, so let's break this down. We have two jobs: one to generate the pipeline and another to execute it. In the generate-pipeline job, we run a script that writes the pipeline YAML. That script could look something like this:

// generate-pipeline.ts

import { readFileSync, writeFileSync } from "fs";
import { parse, stringify } from "yaml";
import path from "path";

// Start from a static set of default job definitions
const pipelineYml = parse(
  readFileSync(path.resolve(`${__dirname}/defaults.yml`)).toString()
);

function hasChangeForApp(app: string): boolean {
  // run your own code to determine whether the app should be deployed
  return true;
}

function compile() {
  // Map each downstream job to the jobs it must wait on
  const deployChains: Record<string, string[]> = {
    "api-deploy": ["db-deploy"],
  };

  // Add the deploy chains to the `needs` keyword of downstream jobs.
  // This defines the deploy order.
  Object.entries(deployChains).forEach(([jobName, needs]) => {
    needs.forEach((need) => {
      if (hasChangeForApp(need)) {
        console.log(`Adding ${need} to needs of ${jobName}`);
        pipelineYml[jobName].needs = pipelineYml[jobName].needs ?? [];
        pipelineYml[jobName].needs.push(need);
      }
    });
  });

  const pipelineFile = path.resolve(`${__dirname}/../../pipeline.yml`);
  console.log("Writing pipeline.yml");
  writeFileSync(pipelineFile, stringify(pipelineYml));
  console.log("Done");
}

compile();

Now, with this solution, we can add these conditional needs without defining multiple stages. The optional dependency is added only when it needs to be, and the deploy order is preserved.
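For illustration (assuming defaults.yml contains the two deploy jobs from earlier), the generated pipeline.yml when both apps have changes might look like this:

# pipeline.yml (generated)

db-deploy:
  stage: deploy
  script:
    - echo "Deploy Database Change"

api-deploy:
  stage: deploy
  needs:
    - db-deploy
  script:
    - echo "Deploy API Change"

When there is no database change, the script simply leaves api-deploy's needs empty, and it deploys on its own.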

When does this make sense?

Yes, it is more complicated on the surface. For simpler projects, I think using separate stages makes a lot of sense, but as the number of applications in our monorepo has grown, we've found we want finer control over how we trigger jobs. Let me explain. We use Turborepo to cache our builds. This cache saves us a lot of time, and we wanted to take advantage of Turborepo's turbo-ignore command to determine whether we should trigger a deploy or not. How? This command plugs directly into the dependencies defined in each package.json to decide whether a build is needed. That breaks us free from the GitLab CI/CD changes keyword, where we are limited to wildcard paths.
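For contrast, here is roughly what the changes-based approach looks like; every dependency path must be listed by hand (the paths below are illustrative, not our actual layout):

# rules:changes requires hand-maintained wildcard paths

api-deploy:
  stage: deploy
  rules:
    - changes:
        - apps/api/**/*
        - packages/shared/**/*  # every package the API depends on, by hand
  script:
    - echo "Deploy API Change"

Every time a package gains a new dependency, someone has to remember to update these lists.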

In the actual production code for our dynamic pipeline, we add a call to turbo-ignore for each app to determine whether to include that app's jobs in the pipeline.yml file. Because turbo-ignore looks at the package.json dependencies, we have context on the relationship between monorepo packages and our deployable apps. So when our developers create or add new dependencies in the monorepo, our pipelines intelligently detect and trigger the appropriate builds based on those new dependencies. Now that's one less thing to keep track of during development!
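A minimal sketch of how such a check could be wired in. This is our illustration, not Hona's actual code: the hasChangeForApp wrapper and the jobsToInclude helper are hypothetical names, and we rely on turbo-ignore's documented behavior of exiting 0 when a build can be skipped and non-zero when it should run.

```typescript
import { spawnSync } from "child_process";

// Ask turbo-ignore whether `app` (or any package it depends on) changed.
// turbo-ignore exits 0 when the build can be skipped, non-zero otherwise.
function hasChangeForApp(app: string): boolean {
  const result = spawnSync("npx", ["turbo-ignore", app], { stdio: "ignore" });
  return result.status !== 0;
}

// Pure helper: given a change checker, pick which deploy jobs to include
// in the generated pipeline.yml.
function jobsToInclude(
  apps: string[],
  changed: (app: string) => boolean
): string[] {
  return apps.filter(changed).map((app) => `${app}-deploy`);
}

// Example wiring: include jobs only for apps turbo-ignore flags as changed.
// const jobs = jobsToInclude(["db", "api"], hasChangeForApp);
```

Keeping the job-selection logic in a pure function like jobsToInclude also makes the generator easy to unit test without invoking turbo-ignore at all.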


Zachary McDonnell

10+ years of experience as a software engineer. Especially interested in DevOps, machine learning, and high-quality system architecture.