CI/CD at Ai Incube

Fernando González Cortés
aiincube-engineering
7 min read · Mar 13, 2020

In a context of active software development it is crucial to have a continuous integration / continuous delivery (CI/CD) system that helps us keep our software stable and allow us to deploy new versions at any moment.

At Ai Incube we are using Kubernetes on Google Cloud Platform (GCP) and below I’ll show how we implemented our CI/CD system on top of that. We’ll define what it does, then we’ll see how we built the necessary parts on top of Google Cloud and finally we’ll see those parts working together to cover the functions of a CI/CD system.

What is our CI/CD doing?

We have a Gradle multi-project with a test suite containing unit and integration tests that takes a few minutes to run. Integration tests run against several PostgreSQL databases.

Our CI/CD system is in charge of:

  • Running those tests whenever a commit is pushed to the source repository.
  • Creating container images for the artifacts we deploy.
  • Actually deploying those artifacts in the production cluster.

How is it built?

Lean back and relax. This is going to be long because I will explain not only what we finally built but also what we built in the process that didn’t work for us, in the hope that I can spare you some hours.

GCP provides you with access to Cloud Build, which allows to run “scripts” in a machine that GCP starts ad-hoc. Scripts have steps, and each step runs a command in a container image. An example of such script could be the following one:

steps:
- id: 'Run the gradle build'
  name: 'gcr.io/cloud-builders/gradle'
  args: ['build']

which consists of a single step that runs gradle build using their Gradle image: gcr.io/cloud-builders/gradle.

There are official images, which include among others the Gradle image above, and community images, which cover many other tools.

Warning: official images are cached on the machine running your script, so they are quick to start. Community images are not (at least not completely), which adds some overhead to the build time because the image has to be pulled first.

Gradle build

In order to make a Gradle build we need:

  • The code
  • A Cloud Build script running gradle build in their gcr.io/cloud-builders/gradle image.

With command line tools you can submit scripts to Cloud Build:

gcloud builds submit --config=mybuild.yaml

Where mybuild.yaml is the file containing the build script. But how can the build access your code? The gcloud command does have a --source parameter, but that is not the approach we want to follow, because we want builds to run automatically whenever a commit or a tag is created.

We need to somehow configure our source code in Cloud Build, and this is done with Triggers. With Triggers you can listen to repository events (creations of commits or tags) and run Cloud Build scripts existing in that repository. When the repository event takes place, the Cloud Build process will first of all checkout the code and afterwards will run the Cloud Build script in the configured location.

The following image shows a trigger configured to run the /cicd/run-tests.yaml script.

Cloud Build trigger details
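We created ours through the console, but a roughly equivalent trigger can also be created from the command line, with something along these lines (the repository name is a placeholder):

gcloud beta builds triggers create cloud-source-repositories --repo=my-mirrored-repo --branch-pattern=".*" --build-config=cicd/run-tests.yaml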

That script can have this content:

steps:
- id: 'Run the gradle build'
  name: 'gcr.io/cloud-builders/gradle'
  args: ['build']

which would run gradle build on the checked out code.

Warning: If you don’t have permissions to see the logs of your builds, use logsBucket and point to a Cloud Storage bucket you have access to.

steps:
- id: 'Run the gradle build'
  name: 'gcr.io/cloud-builders/gradle'
  args: ['build']
logsBucket: gs://my-build-repo/cloudbuild-logs

Accessing the repository

Configuring the repository may not be straightforward though. If the repository is public you're done, but if it is private and not in Google Cloud Source Repositories, you have a bit of fun ahead.

There are instructions on how to give Cloud Build an SSH key to connect to GitHub, but that is not how we did it. We have the code in BitBucket, and Cloud Build offers the possibility to mirror BitBucket repositories into Cloud Source.

When configured, it creates a repository in Cloud Source that mirrors the content of the repository in BitBucket. That’s what we did and that’s why in the previous image you can see on the top left “Source: Bitbucket (mirrored)”. Once the repository is mirrored it is listed in the Triggers user interface and can be selected to create a Trigger.

Environment for integration tests

Once the trigger is created you can push a commit. The commit will be detected by the trigger, the build will run on that specific commit and all your unit tests will be executed. Hopefully they will pass and a nice green check will appear in the build. But this works only for unit tests.

Integration tests requiring some external components, for example a database, will fail because there is no such component in the Cloud Build machine. How to solve this problem?

First, it is possible to start a database in one step and connect to it in the next one. There is also some documentation about how to run integration tests with the required environment, which proposes three options:

  1. Use docker-compose to create the environment in the machine.
  2. Make the build script deploy a container that runs the tests in the cluster of your choice.
  3. “Deploy to a self-destructing VM”. The documentation here is not very detailed at all.

We have a test cluster with a lot of data preloaded, so we opted for option two from the previous list.

In one of the Cloud Build examples they deploy the artifact to be tested into the cluster and then run a curl-based shell script that tests it. But for those of you who, like us, write your integration tests with JUnit, this approach does not work, because we need both the code and Gradle. Instead, we created a Cloud Build script (sketched after the list) that:

  • Checks out the code (done automatically if the build is in response to a trigger)
  • Builds an image based on Gradle that contains the checked out code.
  • Deploys the image in the container registry.
  • Runs a Kubernetes job using the previous image that runs gradle build.
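A simplified sketch of such a script could look like the one below. The image name, the Job manifest path and the cluster details are placeholders, and a real version would also have to wait for the Job to finish and propagate its exit status:

steps:
- id: 'Build an image containing Gradle and the checked out code'
  name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/integration-tests:$COMMIT_SHA', '.']
- id: 'Push the image to the container registry'
  name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/integration-tests:$COMMIT_SHA']
- id: 'Create a Kubernetes Job in the test cluster that runs gradle build'
  name: 'gcr.io/cloud-builders/kubectl'
  # the Job manifest is assumed to reference the image pushed above
  args: ['apply', '-f', 'cicd/integration-test-job.yaml']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=my-zone'
  - 'CLOUDSDK_CONTAINER_CLUSTER=my-cluster'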

That will work well if you don't mind the overhead: just to invoke Gradle, your build has to checkout the code, create the container image and upload it to the container registry; then the cluster has to pull the image and the Job has to start. Only then does Gradle start downloading its dependencies, which may add a few more minutes.

If you go this way, there are a couple of community cloud builders that can help you save the Gradle cache in a Cloud Storage bucket between executions and spare you a few minutes.
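We won't cover those builders here, but the underlying idea is simply to persist the Gradle home between builds. A rough sketch of it with plain gsutil could look like this (the bucket name and cache path are placeholders, and it relies on /workspace being the directory shared between steps):

steps:
- id: 'Restore the Gradle cache (tolerate a miss on the first run)'
  name: 'gcr.io/cloud-builders/gsutil'
  entrypoint: 'bash'
  args: ['-c', 'gsutil cp gs://my-build-cache/gradle-home.tar.gz . && tar -xzf gradle-home.tar.gz || true']
- id: 'Run the gradle build with its home under /workspace'
  name: 'gcr.io/cloud-builders/gradle'
  args: ['--gradle-user-home', '/workspace/.gradle_home', 'build']
- id: 'Save the Gradle cache back to Cloud Storage'
  name: 'gcr.io/cloud-builders/gsutil'
  entrypoint: 'bash'
  args: ['-c', 'tar -czf gradle-home.tar.gz .gradle_home && gsutil cp gradle-home.tar.gz gs://my-build-cache/']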

We went down this road for a while, but one day we moved to something better.

Jenkincito

Jenkincito is Spanish for “little Jenkins”. The name is a nod to Jenkins, the well-known CI/CD tool, and to the fact that this is a modest approach to CI/CD. It's very specific and we'll probably need something broader one day, but so far it's doing its part well.

What does Jenkincito do? It is a Kubernetes deployment in our test cluster that keeps a pod running with the ability to checkout code from our repositories and to run Gradle commands.

Instead of starting a pod each time with an empty Gradle cache, checking out the whole repository, etc., we keep the pod alive forever. Thus, not only do we keep a warm Gradle cache, we also keep the output of the last Gradle build, so Gradle can sometimes skip running unnecessary tests when the corresponding task is up to date.
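The deployment itself is nothing exotic. A trimmed-down sketch could look like the following; the image name is a placeholder, and the real pod also needs credentials to reach our repositories, which are omitted here. The app=jenkincito label and the dev namespace are what the Cloud Build script below uses to find the pod:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkincito
  namespace: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkincito
  template:
    metadata:
      labels:
        app: jenkincito
    spec:
      containers:
      - name: jenkincito
        # an image with Gradle, git and the run-jenkincito.sh script
        image: gcr.io/my-project/jenkincito:latest
        # keep the pod (and its Gradle cache and working copy) alive between builds
        command: ['sleep', 'infinity']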

The Cloud Build script looks like this one:

steps:
- id: 'run the build'
  name: 'gcr.io/cloud-builders/kubectl'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    # configure gcloud
    gcloud container clusters get-credentials --project="my-project" --zone="my-zone" "my-cluster"
    # Get the name of the pod running jenkincito
    podname=$(kubectl -n dev get pods -lapp=jenkincito --field-selector=status.phase==Running -o jsonpath='{.items[0].metadata.name}')
    # Tell jenkincito to run the build
    echo "asking pod $podname to do the build with my-repo $BRANCH_NAME, $COMMIT_SHA"
    kubectl -n my-namespace exec -it $podname -- /home/gradle/src/run-jenkincito.sh my-repo $BRANCH_NAME $COMMIT_SHA build

The uppercase variables (BRANCH_NAME, COMMIT_SHA) are resolved by the Cloud Build environment.

Putting it all together

With Jenkincito running in our test cluster, when a commit is pushed to the repository:

  1. A Cloud Build trigger runs a script that locates Jenkincito and asks it to run the tests.
  2. Jenkincito will checkout the code and run gradle build.
  3. The Cloud Build script exit status will be that of the Gradle build and a message will be sent to the cloud-builds Pub/Sub topic.
  4. We channel cloud-builds messages to a Slack channel, so soon after we push our code we get some feedback there.

Similarly, when a tag is pushed to the repository the process is exactly the same, except that instead of running gradle build, Jenkincito runs gradle jib. We use Jib to build our container images.
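In practice that just means a second trigger, fired on tags, whose script differs only in its last line. One plausible version of that line, assuming run-jenkincito.sh takes the ref and the Gradle task as its last arguments (TAG_NAME is the built-in substitution Cloud Build provides for tag triggers):

# last line of the tag trigger's script
kubectl -n my-namespace exec -it $podname -- /home/gradle/src/run-jenkincito.sh my-repo $TAG_NAME $COMMIT_SHA jib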

The last step would be to deploy a specific artifact based on those images whenever they are created. That involves another component I haven’t talked about so we will leave that for another post.
