Elastic, Cost Efficient CI/CD with Jenkins and Milpa

Published in the Elotl blog, Apr 11, 2019

If you work as a developer, there is a good chance you know and use Jenkins, the leading open source automation server for CI/CD. The classic Jenkins setup involves a master and usually several agents: the master is responsible for coordination and provides the user interface (via a web UI and API), and the agents are designed to run the CI/CD jobs the user configured. The master and the agents in most cases run in separate virtual machines or containers.

This distributed setup is great for scalability: as your team and the number of builds and jobs they want to run grow, it is easy to add another agent node that can perform more work.

The classic distributed Jenkins architecture

The ability to scale to meet workload demands makes public cloud a great fit for Jenkins environments where capacity requirements vary over the course of the day or week. There are even Jenkins plugins that can manage this process on an on-demand basis, for example on EC2, or on Kubernetes. However, they might be slow to react to changes in capacity requirements, or still require static, pre-provisioned infrastructure that goes unused outside of peak hours and requires precious devops resources for performing upgrades, patching security holes, and so on.

At Elotl, our build infrastructure is based on our serverless engine called Milpa. The Jenkins master server runs as a Milpa pod, just like each Jenkins job — we don’t need any static VM or container instances to run agents. Jobs are ephemeral: Milpa will start a cloud instance for them when the job is kicked off by Jenkins, and the instance is terminated as soon as the Jenkins job has finished.

This setup has several advantages:

No compute capacity is wasted. The only two cloud instances that are always on are the one running the Milpa controller process itself, and the one running the Jenkins master pod.
Great scalability. Even if multiple builds are running in parallel, we are not constrained by some static infrastructure limit. Developers are never waiting for available agents and executors. Our build capacity is truly elastic, automatically and instantly scaling up and down.
Simplicity and security. As a small startup, we can’t afford to dedicate engineers to maintain infrastructure, or do operating system upgrades and patch security holes on it.
Minimal overhead. On the cloud instances running Jenkins jobs, there is no agent process: from the master's viewpoint, the jobs are running via one of the executor slots available on the master itself. Almost all the memory and CPUs on the cloud instances are available to the jobs.

Let’s take a look at how this all works under the hood: first the high level setup, then one of our internal builds as an example (the CI pipeline for Milpa itself; we use Milpa to build Milpa), including its cost breakdown.

High level setup

The Jenkins master runs as a Milpa pod, i.e. in a cloud instance, via a Milpa deployment (so we can use rolling updates and ensure that there is always a running pod).

Whenever a new Jenkins job is triggered (e.g. via a commit, or a new pull request), Jenkins will create a new pod for the job (step 1 on the figure above).

Once the pod comes up, the build environment (environment variables, a clone of the source code repository, etc.) configured for that particular job is copied over to the pod (step 2). The build script or command that is configured for the job takes over from here, and once it finishes, the master looks at the exit code to determine whether the build failed or succeeded (step 3).
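The three steps above can be sketched in Python roughly as follows. This is only an illustration of the flow, not the managed script we actually use: `LocalBackend` and `run_job` are hypothetical names, and the backend here simply runs the command in a local subprocess instead of starting a cloud instance.

```python
import os
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class LocalBackend:
    """Hypothetical stand-in for Milpa: 'pods' are just dict entries."""
    pods: dict = field(default_factory=dict)

    def create_pod(self, name):
        # Step 1: in the real setup this would start a cloud instance.
        self.pods[name] = {"env": {}}

    def copy_environment(self, name, env):
        # Step 2: hand the job's build environment over to the pod.
        self.pods[name]["env"] = dict(env)

    def run(self, name, command):
        # Step 3: run the build command and report its exit code.
        env = {**os.environ, **self.pods[name]["env"]}
        return subprocess.run(command, env=env).returncode

    def delete_pod(self, name):
        # Jobs are ephemeral: the pod goes away once the job is done.
        del self.pods[name]


def run_job(backend, name, env, command):
    """Return True if the build command exited with code 0."""
    backend.create_pod(name)
    try:
        backend.copy_environment(name, env)
        return backend.run(name, command) == 0
    finally:
        backend.delete_pod(name)


backend = LocalBackend()
check_env = "import os, sys; sys.exit(0 if os.environ.get('BUILD_ID') == '1' else 1)"
ok = run_job(backend, "job-1", {"BUILD_ID": "1"}, [sys.executable, "-c", check_env])
failed = run_job(backend, "job-2", {}, [sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok, failed, backend.pods)
```

The `finally` clause mirrors the ephemeral nature of the jobs: the pod is removed whether the build passes or fails.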

The deployment manifest for the Jenkins master in Milpa looks like this:

---
apiVersion: v1
kind: Deployment
metadata:
  name: jenkins-master
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins-master
      annotations:
        pod.elotl.co/milpactl-volume-name: 'milpactl-vol'
    spec:
      resources:
        memory: 2Gi
        volumeSize: 60Gi
      volumes:
        - name: data
          emptyDir: {}
        - name: milpactl-vol
          packagePath:
            path: milpactl
      initUnits:
        - image: elotl/restore-jenkins-config
          name: restore-jenkins-config
          command: ["/restore-jenkins-config.sh", "restore"]
          env:
            - name: SHARED_DIRECTORY
              value: /data
            - name: S3_BUCKET
              value: elotl-jenkins
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: jenkins-secrets
                  key: aws-access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: jenkins-secrets
                  key: aws-secret-access-key
          volumeMounts:
            - name: data
              mountPath: /data
      units:
        - image: jenkins/jenkins:lts
          name: jenkins
          ports:
            - port: 8080
              protocol: TCP
              name: http-port
            - port: 50000
              protocol: TCP
              name: jnlp-port
          env:
            - name: JAVA_OPTS
              value: -Djenkins.install.runSetupWizard=false
          volumeMounts:
            - name: data
              mountPath: /var/jenkins_home
            - name: milpactl-vol
              mountPath: /milpactl
        - image: elotl/restore-jenkins-config:latest
          name: backup-jenkins-config
          command: ["./backup-daily.sh"]
          env:
            - name: SHARED_DIRECTORY
              value: /data
            - name: S3_BUCKET
              value: elotl-jenkins
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: jenkins-secrets
                  key: aws-access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: jenkins-secrets
                  key: aws-secret-access-key
          volumeMounts:
            - name: data
              mountPath: /data

A couple of things here:

annotations:
  pod.elotl.co/milpactl-volume-name: 'milpactl-vol'

This annotation will have Milpa add a volume with milpactl (the command line tool for talking to a Milpa server) installed, so Jenkins can interact with Milpa. This allows the Jenkins master to launch new pods when a job is started.

initUnits:
  - image: elotl/restore-jenkins-config
    name: restore-jenkins-config
    command: ["/restore-jenkins-config.sh", "restore"]

This init unit will restore the configuration for the Jenkins master from S3 when the Jenkins master pod is restarted (e.g. when the deployment is updated).

There are two regular units as well:

units:
  - image: jenkins/jenkins:lts
    name: jenkins

This unit is the one running the Jenkins master process, using the official container image from the Jenkins project.

- image: elotl/restore-jenkins-config:latest
  name: backup-jenkins-config
  command: ["./backup-daily.sh"]

This unit performs a daily backup to S3 (so the init unit will have a fresh copy of the configuration to restore).

We add a managed build script to our Jenkins master that does all the work when it comes to running jobs in Milpa pods:

A managed script runs Jenkins jobs in Milpa pods

Now it is only a matter of choosing this script to build a project:

This is pretty much it. If we look at our EC2 console while the job is running, we will see that both the Jenkins master and the job run as EC2 instances, managed by Milpa.

Milpa pods running the Jenkins master and a Jenkins job

Building Milpa on Milpa

The source code for Milpa is hosted on GitHub, and pull requests or merges into master will trigger a build on Jenkins. The build consists of multiple jobs:

First, it runs unit tests and integration tests. If they all pass, a build artifact is created and archived. The rest of the jobs in this build will all use this artifact.
Next, we run three jobs in parallel to perform system tests for high level features, such as our Kubernetes integration, and various workloads for acceptance testing.
If all the unit, integration, system and acceptance tests pass, a short job checks if this build is tagged as a release in git, and uploads the build artifact to our release bucket on S3 if it is. Voilà — the next release of Milpa is now available for download.
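The ordering of these stages can be sketched as below. This is a simplified model, not our Jenkins configuration: the job names, `run_job`, and `is_release` are hypothetical placeholders supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names for the three parallel system/acceptance test jobs.
PARALLEL_JOBS = ["system-test-1", "system-test-2", "system-test-3"]


def run_pipeline(run_job, is_release):
    """run_job(name) -> bool and is_release() -> bool are caller-supplied hooks."""
    executed = []

    def step(name):
        executed.append(name)
        return run_job(name)

    # Stage 1: unit and integration tests; on success the artifact is archived.
    if not step("unit-and-integration-tests"):
        return "failed", executed

    # Stage 2: the three system test jobs run in parallel.
    with ThreadPoolExecutor(max_workers=len(PARALLEL_JOBS)) as pool:
        results = list(pool.map(step, PARALLEL_JOBS))
    if not all(results):
        return "failed", executed

    # Stage 3: upload the artifact only if the build is tagged as a release.
    if is_release():
        step("upload-release")
        return "released", executed
    return "passed", executed


status, executed = run_pipeline(lambda name: True, lambda: True)
print(status, len(executed))
```

A failure in any stage stops the pipeline before the release check, so an untested artifact is never uploaded.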

As for resource requirements and instance sizing, we wanted to allocate two vCPUs for each job in this build, since most of them are CPU intensive and can benefit from multiple cores. A cost efficient instance type on EC2 for this kind of workload is c5.large. Milpa chooses it automatically: the script starting the build job pod just specifies the number of CPU cores and the amount of memory the job needs, and Milpa determines the most cost efficient instance type that satisfies these requirements.
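To make the selection rule concrete, here is a minimal sketch of "cheapest instance type that satisfies the request". It is not Milpa's actual algorithm. The catalogue is illustrative: only the c5.large row (2 vCPUs, 4 GiB, $0.085 per hour) reflects a real instance type, the other rows are made-up placeholders, and the 4 GiB memory request is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    usd_per_hour: float


# Illustrative catalogue. Only the c5.large entry is real;
# the other rows are placeholder numbers, not price quotes.
CATALOGUE = [
    InstanceType("small-example", 1, 2.0, 0.03),
    InstanceType("c5.large", 2, 4.0, 0.085),
    InstanceType("general-example", 2, 8.0, 0.10),
    InstanceType("big-example", 4, 8.0, 0.17),
]


def cheapest_fit(vcpus, memory_gib, catalogue=CATALOGUE):
    """Return the lowest-priced type meeting both requirements, or None."""
    candidates = [t for t in catalogue
                  if t.vcpus >= vcpus and t.memory_gib >= memory_gib]
    return min(candidates, key=lambda t: t.usd_per_hour, default=None)


# Two vCPUs per job, as in this build; 4 GiB of memory is an assumed request.
print(cheapest_fit(2, 4).name)
```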

For maximum parallelism, we need to be able to run three jobs in parallel (our developers are pretty typical and don’t like waiting for builds). With a static infrastructure, that would mean three c5.large instances running Jenkins agents and one t3 instance running the Jenkins master. As of the time of writing this post, a c5.large instance costs $0.085 per hour, and the t3 instance is $0.0209 per hour. For one week the overall cost for CI/CD for this one build would be 7 x 24 x (3 x 0.085 + 0.0209) = $46.35.
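The same estimate as a few lines of Python, using only the prices quoted above (the variable names are ours):

```python
HOURS_PER_WEEK = 7 * 24
C5_LARGE_USD_PER_HOUR = 0.085   # agent instances
T3_USD_PER_HOUR = 0.0209        # Jenkins master instance
AGENTS = 3

# Three always-on agents plus one always-on master, for a full week.
static_weekly = HOURS_PER_WEEK * (AGENTS * C5_LARGE_USD_PER_HOUR + T3_USD_PER_HOUR)
print(f"${static_weekly:.2f}")
```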

When deployed via Milpa, the only instance that is continuously up and running is the Jenkins master (and the same master is used for all of our builds, but let’s ignore this fact for the sake of simplicity), costing us 7 x 24 x 0.0209 = $3.51 per week. Milpa also provides usage information; let’s check a recent usage report for one week:

$ milpactl usage --start-date '2019-04-01 00:00:00' --end-date '2019-04-08 00:00:00' -l jenkins-build=milpa
USAGE      TYPE       HOURS
Instance   c5.large   32.634489
Storage    gp2        261.075912

Report Period Start Date: 2019-04-01 00:00:00 +0000 UTC
Report Period End Date: 2019-04-08 00:00:00 +0000 UTC

The instances running our build cost 32.63 x 0.085 = $2.77 (not too bad, considering we ran approximately 100 jobs during this timeframe). The overall cost of the build for this week was thus 2.77 + 3.51 = $6.28 — on a monthly basis, we are saving more than $160 (that is more than 86%) over a static setup, for just one build (making our CEO very happy!).
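For readers who want to check the numbers, the sketch below parses the usage table and redoes the arithmetic. The parsing is a guess at a convenient approach based on the column layout shown above, not a documented format. Like the estimate in the text, it leaves out the storage line.

```python
REPORT = """\
USAGE      TYPE       HOURS
Instance   c5.large   32.634489
Storage    gp2        261.075912
"""


def instance_hours(report, instance_type):
    """Sum the HOURS column for 'Instance' rows of the given type."""
    total = 0.0
    for line in report.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) == 3 and fields[0] == "Instance" and fields[1] == instance_type:
            total += float(fields[2])
    return total


STATIC_WEEKLY = 46.35   # static setup, from the estimate earlier in the post
MASTER_WEEKLY = 3.51    # always-on Jenkins master
C5_LARGE_USD_PER_HOUR = 0.085

hours = instance_hours(REPORT, "c5.large")
jobs_weekly = hours * C5_LARGE_USD_PER_HOUR
milpa_weekly = jobs_weekly + MASTER_WEEKLY
saving = STATIC_WEEKLY - milpa_weekly

print(f"jobs ${jobs_weekly:.2f}, total ${milpa_weekly:.2f}")
print(f"saving {saving / STATIC_WEEKLY:.0%} per week, ${4 * saving:.2f} over four weeks")
```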

Milpa can also leverage spot EC2 instances, further lowering the cost, but this comes with a caveat: the instances running the jobs might be terminated at any time. If rerunning a build is not a problem, this is a great way to save even more on infrastructure costs.
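One way to live with that caveat is to rerun a build when its instance disappears. The sketch below shows the idea only: `InstanceTerminated` and `run_with_retries` are hypothetical names, and how a termination is actually detected is not covered here.

```python
class InstanceTerminated(Exception):
    """Hypothetical signal that the spot instance was reclaimed mid-build."""


def run_with_retries(build, max_attempts=3):
    """Call build() and rerun it only when the instance was terminated.

    Returns (result, attempts_used). Ordinary build failures are not retried.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return build(), attempt
        except InstanceTerminated:
            if attempt == max_attempts:
                raise


# Example: a build that loses its instance twice before succeeding.
calls = {"n": 0}


def flaky_build():
    calls["n"] += 1
    if calls["n"] < 3:
        raise InstanceTerminated()
    return "success"


result, attempts = run_with_retries(flaky_build)
print(result, attempts)
```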

So here we go — ever since we implemented this setup, we have never had to worry about capacity planning or managing static pieces of infrastructure. Our builds happily run in an elastic fashion, with optimal parallelism, even if the same build needs to run in multiple instances (e.g. for multiple pull requests at the same time).

If you would like to see a step by step guide on how to set up Jenkins on Milpa, you can find one here.

Finally, if you are interested in setting up a CI/CD environment similar to this, let us know if we can help (or go ahead, download our free community edition and try it out yourself).

Written by Vilmos Nebehaj