CI/CD to OTC using GitHub Actions
An article about how we are integrating and deploying our code at Metr.
Are you using the Open Telekom Cloud (OTC) as a cloud solution and wondering how to deploy to it through a CD pipeline? As newcomers at metr, we were often challenged to understand the infrastructure ecosystem and the decisions that were taken before we joined. Adding a continuous integration and deployment pipeline was the perfect project and an excellent opportunity to go through all of these pieces step by step, using a tool that we had yet to discover. What follows is our journey and our learnings. 🚀
Introduction
To satisfy our clients effectively and efficiently, and to get our own work done productively, we use agile methodologies to develop new features and maintain our products. Our products and team have recently gone through a growth spurt and will continue to grow for the foreseeable future. Our goal was therefore to automate the integration and deployment of our code, in order to facilitate agile development and take the stress out of deployments.
Note: Before going further, it is worth highlighting that we use git-flow as a strategy to add new features and later integrate them into the existing code across our projects. It is also important to mention that we will be presenting how to integrate and deploy a Django project to staging.
Project Overview
Every time a developer wants to merge a feature branch into the develop branch, we need to run all unit and integration tests to make sure that nothing breaks with the new code changes. If something goes wrong, it’s important to get notified in order to fix it, and it’s also important to prevent the merge. This is what we call the integration part.
After integrating the code, the developer needs to deploy it to staging first. We are using OTC’s Cloud Container Engine (CCE), so deploying to OTC means pushing the project’s image from a Docker client to the CCE cluster, then deploying the related service using Kubernetes commands. This is what we call the deployment-to-staging part.
The integration part is executed whenever a PR is created, an update is pushed to an open PR, or the PR is merged. The deployment part is only executed when the PR is merged, and only after the integration part has succeeded. No jobs are run on a draft PR.
We prevent merging a PR if tests do not pass, and we warn the developer when merging manually, by adding rules to the develop branch as explained in the documentation about how to enable required status checks.
We require only the test job to pass before merging, as shown in the following picture.
If tests are not successful, automatic merging is prevented. The developer is also informed and has to confirm before merging manually.
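For those who prefer the terminal over the web UI, the same branch protection rule can also be set through GitHub’s REST API. A minimal sketch, assuming the GitHub CLI is installed and authenticated with admin rights; OWNER/REPO is a placeholder, and "running-tests" must match the job name used in the workflow:

```shell
# Write the branch protection payload; "contexts" lists the required status checks.
cat > protection.json <<'EOF'
{
  "required_status_checks": { "strict": true, "contexts": ["running-tests"] },
  "enforce_admins": false,
  "required_pull_request_reviews": null,
  "restrictions": null
}
EOF
# Apply it to the develop branch (commented out here, since it needs real credentials):
# gh api -X PUT repos/OWNER/REPO/branches/develop/protection --input protection.json
```

With "strict" set to true, GitHub additionally requires the branch to be up to date with develop before merging.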
Our CI/CD Tool: GitHub Actions
We opted for GitHub Actions for many reasons:
- First, GitHub offers many ready-to-use actions in its marketplace. While skimming through it, we found actions to suit our needs and use cases, so we used them to keep our workflow file as compact as possible.
- Secondly, it was an easy choice since GitHub is already our code collaboration and version control tool, and it offers 3,000 minutes/month of CI/CD pipeline time for our private repositories.
- Third, we needed to trigger our pipelines directly after certain git operations, especially pushing to an open PR, merging a feature into the develop branch, or merging a release into the master branch.
Our basic workflow steps
This is how we are setting our workflow to automate integration and deployment to staging for a Django project:
1. We first set the name of the workflow:
name: Staging CI/CD
2. Then we define which events trigger the workflow. In our case, we want it to run on all open PRs and on all pushes to the develop branch:
on:
  pull_request:
    branches: [ develop ]
  push:
    branches: [ develop ]
3. We decided to define environment variables at the workflow level because we make use of them in different steps throughout our jobs. Most of these variables have dummy values, and some of them are used for building the app container used for testing.
Example:
env:
  SECRET_KEY: secret
  NAME: project-name
  TAG: tag
4. We then define our jobs; in our case, one job for integration and another one for deployment.
Integration job
1. First, we name the job:
jobs:
  running-tests:
2. Second, we only want to execute the job under a given condition, namely on a push event or on a PR that’s not a draft:
if: github.event_name == 'push' || (github.event_name == 'pull_request' && github.event.pull_request.draft == false)
3. Then, we define the OS for the GitHub-hosted virtual machine in which our dependencies will be installed and our tests will be executed:
runs-on: ubuntu-latest
4. It’s possible to define services as Docker containers to be used in the job. In our case, we define Postgres with a pinned version, since our tests will need this service.
services:
  postgres:
    image: postgres:12.4
    env:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: postgres
    ports: [ '5432:5432' ]
5. Finally, we set the following steps to be executed:
steps:
  # First we check out the working branch:
  - uses: actions/checkout@v2
  # We set up our Python environment:
  - name: Set up Python 3.8
    uses: actions/setup-python@v2
    with:
      python-version: 3.8
We use the built-in cache action for dependencies to optimize execution time, reusing the cached dependencies installed in previous runs. We bump the version suffix in the key name each time we want to invalidate the cache manually.
  - name: Cache python dependencies
    id: cache-python
    uses: actions/cache@v2
    with:
      path: ${{ env.pythonLocation }}
      key: ${{ env.pythonLocation }}-pip-${{ hashFiles('**/requirements-dev.txt') }}-v1
      restore-keys: |
        ${{ env.pythonLocation }}-pip-
        ${{ env.pythonLocation }}-
  - name: Install python dependencies
    if: steps.cache-python.outputs.cache-hit != 'true'
    run: |
      python -m pip install --upgrade pip
      pip install -r requirements-dev.txt
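To see why the cache key also invalidates itself when dependencies change, it helps to know that hashFiles computes a SHA-256 hash over the matched files. A local sketch with a dummy requirements file (the package pins are made up for illustration); the trailing -v1 is the manual version bump mentioned above:

```shell
# First dependency list and the key derived from its content hash:
printf 'Django==3.1\npytest==6.1.1\n' > requirements-dev.txt
KEY1="pip-$(sha256sum requirements-dev.txt | cut -c1-12)-v1"
# Change one pin: the hash, and therefore the cache key, changes with it:
printf 'Django==3.1\npytest==6.1.2\n' > requirements-dev.txt
KEY2="pip-$(sha256sum requirements-dev.txt | cut -c1-12)-v1"
echo "$KEY1"
echo "$KEY2"
```

So a fresh cache entry is created automatically whenever requirements-dev.txt changes, and the -v1 suffix remains available as an escape hatch.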
6. Our next step is to run the tests. This is where we make use of the Postgres service we defined earlier, by pointing the DATABASE_URL variable at it:
  - name: Run tests
    env:
      DATABASE_URL: 'postgres://postgres:postgres@localhost:${{ job.services.postgres.ports[5432] }}/postgres'
    run: pytest
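The ports expression can look cryptic; since our service maps '5432:5432', ${{ job.services.postgres.ports[5432] }} resolves to the fixed host port 5432 here. A sketch of the URL the tests end up with, assuming the Django settings read DATABASE_URL (for example via a helper like dj-database-url):

```shell
# Assemble the connection URL from the same pieces the service container uses:
DB_USER=postgres
DB_PASSWORD=postgres
DB_HOST=localhost
DB_PORT=5432
DB_NAME=postgres
DATABASE_URL="postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"
echo "$DATABASE_URL"
```

Mapping the port explicitly, rather than letting Docker pick a random one, is what makes this expression so predictable.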
7. We send a notification in case our job fails:
  - name: Slack Notification
    if: failure()
    uses: rtCamp/action-slack-notify@v2
    env:
      SLACK_CHANNEL: pipelines
      SLACK_ICON: ":octocat:"
      SLACK_USERNAME: "Github Actions"
      SLACK_COLOR: danger
      SLACK_MESSAGE: "Your message of failure here"
      SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK_URL }}
Deployment job
1. First, we name the job:
build-deploy-to-staging:
2. Second, our job is executed only when the PR is merged (in other words, when changes are pushed to develop) and only if the first job succeeded:
if: github.event_name == 'push'
needs: running-tests
3. Third, we define the OS for the GitHub-hosted virtual machine in which our job steps will be executed:
runs-on: ubuntu-latest
4. Then, because data is not persisted from job to job, we need to check out the working branch again:
steps:
  - uses: actions/checkout@v2
5. In order to publish our Docker image to OTC, we first log in to OTC using a long-term valid login command, stored here as LOGIN_KEY. Then we build, tag and push our image to OTC.
  - name: Build, Tag and Publish docker image
    run: |
      docker login -u ${{ secrets.PROJECT_NAME }}@${{ secrets.AK }} -p ${{ secrets.LOGIN_KEY }} ${{ secrets.IMAGE_ADDRESS }}
      docker build -t $NAME .
      docker tag $NAME ${{ secrets.IMAGE_ADDRESS }}${NAME}:${TAG}
      docker push ${{ secrets.IMAGE_ADDRESS }}${NAME}:${TAG}
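A detail worth spelling out: since the workflow concatenates the secret directly with ${NAME}, the IMAGE_ADDRESS value has to end with a trailing slash. A hypothetical expansion with dummy values (the real registry address lives in the secret):

```shell
# Dummy stand-ins for the GitHub secrets and workflow env vars:
IMAGE_ADDRESS="swr.example-region.otc.example.com/our-org/"  # note the trailing slash
NAME="project-name"
TAG="staging"
# The tag/push steps therefore address the image as:
echo "docker push ${IMAGE_ADDRESS}${NAME}:${TAG}"
```

Without the trailing slash, the image name and the organization path would run together and the push would target the wrong repository.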
6. We keep our infrastructure-related code in a separate repository, so in order to execute Kubernetes commands, we need to check that repository out as well.
  - name: Checkout to infrastructure repo
    uses: actions/checkout@v2
    with:
      repository: metr-systems/infrastructure-project-name
      path: infrastructure
      token: ${{ secrets.DEPLOY_SECRET }}
7. Since we rely on specific versions of kubectl and kustomize, we opted for an action that allows us to pin the needed versions.
  - name: Setup kubectl & kustomize with needed versions
    uses: yokawasa/action-setup-kube-tools@v0.2.0
    with:
      kubectl: 1.13.10
      kustomize: 3.0.0
8. We need to copy our Kubernetes configuration into the running CI/CD container in order to execute the kustomize and kubectl commands:
  - name: Deploy & Restart staging services
    run: |
      mkdir -p ${HOME}/.kube
      echo "${{ secrets.KUBE_CONFIG }}" | base64 -di > ${HOME}/.kube/config
      kustomize build path/to/staging | kubectl --namespace=staging apply -f -
      kubectl --namespace=staging scale deployment project-name --replicas=0
      kubectl --namespace=staging scale deployment project-name --replicas=1
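The KUBE_CONFIG secret itself is produced the other way around: base64-encode the kubeconfig to a single line and paste the result into the repository secrets. A round trip with a dummy config (the file content is made up) shows that decoding restores the original; note that -w0 and -di are GNU coreutils flags, so the invocation differs slightly on macOS:

```shell
# A minimal stand-in for a real kubeconfig file:
printf 'apiVersion: v1\nkind: Config\n' > config.orig
# Encode on one line, as you would when creating the KUBE_CONFIG secret:
ENCODED=$(base64 -w0 config.orig)
# Decode it back, exactly as the workflow step does:
echo "$ENCODED" | base64 -di > config.decoded
diff config.orig config.decoded
```

Scaling the deployment to zero and back to one is a simple way to force the pods to restart with the freshly pushed image.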
9. We send a Slack notification whether the job succeeds or fails:
  - name: Slack Notification
    if: always()
    uses: rtCamp/action-slack-notify@v2
    env:
      SLACK_CHANNEL: pipelines
      SLACK_ICON: ":octocat:"
      SLACK_USERNAME: "Github Actions"
      SLACK_COLOR: ${{ job.status == 'success' && 'good' || 'danger' }}
      SLACK_MESSAGE: "This job has ${{ job.status == 'success' && 'passed!' || 'failed.' }}"
      SLACK_TITLE: "Github Actions Build"
      SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK_URL }}
You can find the full basic workflow that we explained in this gist.
Our notifications system
We preferred the rtCamp/action-slack-notify@v2 action over others because its notification takes the github_actor into account, and that matters to us: through our pair-programming sessions, we often have more than one person committing code to the same branch. Other notification actions only recognize the PR author, which would discredit the work of everyone else contributing to the PR unless they are the one who opened it.
At metr, the culture of communication is based on awareness and no blame, which is why notifications are sent to the whole team in a dedicated Slack channel. For the integration part, to reduce spamming, we collectively decided that notifications should be sent only when tests fail. For the deployment part, notifications are sent on both success and failure.
Summary
After intense research and trying out many different actions for the steps needed in our pipeline, not only did we discover the tool, but we also came to understand a great deal about the systems that help make our contributions meaningful. Of course, GitHub Actions has its downsides, one of which is not being able to test runs locally; consuming minutes while building the pipeline can be stressful. On the upside, the tool is developing rapidly and gaining more and more traction, with more and more user-contributed actions for all sorts of tasks. Thanks to all of those whose actions helped us complete our pipeline. And who knows? Hopefully, our next contribution will be a metr-built action that helps others integrate a much-needed step quickly and hassle-free.