Coding Tips: Patterns for Continuous Integration with Docker on Travis CI

Part 1 of 3: The basics

Reach Digital Health
MobileForGood
Jul 17, 2017 · 9 min read


Update: Parts 2 (The “Docker repo” pattern) and 3 (Python tools for tagging & testing) have now been published.

We are longtime users of Travis CI’s continuous integration service at Praekelt.org. We host the source code for almost all of our software projects on GitHub and have Travis CI run automated tests as soon as the code there is updated. This helps us ensure the quality of our code and simplifies the automation of related tasks, such as releasing new versions of our software. Another advantage for us is that Travis CI is free for open source projects, and most of our code is open source.

The Springster Free Basics website, which is deployed using Docker on Travis CI

More recently, we’ve started making heavy use of Docker containers. Docker containers package software together with all of its dependencies and provide a single, simple entry-point for running that software. This means that software runs more consistently and much less work is necessary to prepare the systems that the software will run on. Essentially, Docker containers make it possible to ensure that our software runs the same way wherever we use it — whether on a developer’s laptop in Cape Town or a production server in Lagos. Docker has been great, but it requires a rethink of existing automation and continuous integration workflows. This series will share some of the lessons we’ve learnt around using Docker on Travis CI.

The steps described in this guide are used to package mobile websites based on our Molo CMS. These processes have allowed our developers to deploy thousands of containers over the past two years.

Some familiarity with Git, Travis CI, and Docker is expected. The examples given are as simple as possible while still following best practices.

Starting out: running Docker

Perhaps the most important thing to keep in mind when it comes to using Docker on Travis CI is that there aren’t really any tools or integrations provided by Travis CI to make working with Docker easier. Essentially, you only receive the plain Docker command-line interface that you would have on your local machine.

Travis itself, of course, has some documentation on using Docker, but most of the examples are a little complicated. Let’s start out with something super simple. Say you have a project (probably in a GitHub repository) with two files in it: a Dockerfile and a .travis.yml file.

The Dockerfile could look like this:
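(A sketch along these lines; the exact base image and the cow’s message are placeholders, but the layout matches the line numbers discussed below.)

    FROM debian:stretch-slim
    RUN apt-get update && apt-get install -y --no-install-recommends cowsay \
        && rm -rf /var/lib/apt/lists/*

    ENV PATH="/usr/games:${PATH}"

    ENTRYPOINT ["cowsay"]
    CMD ["Hello MobileForGood!"]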

Your Dockerfile defines the steps necessary to build a Docker image. Let’s walk through this Dockerfile quickly before moving to the .travis.yml file. Going line by line:

1: We start FROM a Debian Linux base image.

2: We install our software. In this case we install Cowsay, a useless but fun piece of software for making cows say things.*

5: We make sure we can run games like Cowsay easily by adjusting the PATH environment variable.

7: We set the container to run Cowsay when it is started.

8: We tell the cow what to say.

* This apt-get command is a little more complicated than usual but that’s unfortunately the way things are in Docker-land. Read more here.

The .travis.yml file you’d write may look like this:
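(Again a sketch; the image is tagged myimage to match the walkthrough below.)

    sudo: required

    services:
      - docker

    script:
      - docker build -t myimage .
      - docker run myimage

    after_script:
      - docker images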

Let’s go through this before showing you how the build comes out. Section-by-section:

sudo: On Travis, if you want Docker, you need to have sudo capabilities.

services: We want to have Docker running.

script:

  • We use the docker build command to build the image in the current directory. We give the image the tag (or “name”) of myimage.
  • We run the Docker image, i.e. we start the container.

after_script: We run the docker images command. This is not necessary but can provide some useful information about all the Docker images on the machine, such as how much disk space they take up.

Here’s a link to a Travis build that builds this. The full output is too lengthy for this blog post, but here’s what that docker run myimage command does:
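Assuming the CMD from the Dockerfile sketch above, it looks something like this:

     ______________________
    < Hello MobileForGood! >
     ----------------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||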

Yay! Talking cows! What a gimmick!

A note on terminology: we’ll use the phrase “Travis file” to refer to the .travis.yml file. Also, as you may have noticed, we often refer to Travis CI as just “Travis”.

Pushing images to Docker Hub

OK, now that we have a Docker image that is built by Travis, how do we use it? Well, we want to push the image to a Docker Registry for storage. Users can then pull the image from the registry to the machine where they will run the image.
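For example, on any machine with Docker installed, fetching and running a published image (using the hypothetical myorg/myimage name we’ll set up below) is just:

    docker pull myorg/myimage
    docker run myorg/myimage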

There are a bunch of different registries one can use, but for now we will focus on one of the most popular ones: Docker Hub.

One of the unfortunate things with Docker Hub is that it has no real API, so it’s necessary to do some of the setup steps manually by clicking through the website. We need two things to be able to push our image to Docker Hub:

  1. A user (with credentials) that can push the image to a repository.
  2. A repository for our new image.

With Docker Hub, you can create a repository under either a user account or under an organization account — it’s kind of similar to GitHub. What we would recommend is setting up an organization account and adding users to teams in the organization. Although this is more complicated, it offers far better access control options for publishing images. The process for setting this up has a few steps:

  1. Create a Docker Hub account of your own.
  2. Once logged-in, click the Organizations button at the top of the page, and create a new organization with the light blue button.
  3. Straight after creating the organization, you should be presented with the teams management page. Click the light blue button to create a new team. Create a team for automated systems. We called ours “automation”. “Robots” is another popular name for this kind of thing.
  4. Log out of Docker Hub. Create a completely new user (unfortunately, this will require a separate email address 😤). This will be the user that pushes images to Docker Hub from Travis. Write down the credentials. Log out again.
  5. Log in as yourself again. Go to Organizations -> your organization -> Teams and add the new user you created to the team you created for automated systems.

Docker Hub has its limitations, but it offers free and essentially unlimited storage for public Docker images. We do this little dance in order to create a user with limited privileges that is safer to use on an external service like Travis. If you don’t want to do all this, you can just use your personal account credentials. The steps above only need to be done once: now that an organization, automation user, and automation team are set up, they can be reused.

Next, we need to create the repository:

  1. Still logged in as yourself, up in the top right-hand corner, click Create -> Create Repository. Fill out the form. Make sure the namespace you choose is the namespace of the organization you just created.
  2. Yay, you should have a repository. Now, click Collaborators, choose the automation team you created in the box on the right, change the permissions to Write and click the Add Team button.

Moving back to Travis, we now need to set up the steps to push the image to Docker Hub:
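A sketch of the updated Travis file, with placeholder names for the organization, image, and automation user:

    sudo: required

    services:
      - docker

    env:
      global:
        - REGISTRY_USER=myautomationuser
        # Encrypted REGISTRY_PASS value, generated with the travis encrypt command below
        - secure: "<encrypted REGISTRY_PASS value>"

    script:
      - docker build -t myorg/myimage .
      - docker run myorg/myimage

    before_deploy:
      - docker login -u "$REGISTRY_USER" -p "$REGISTRY_PASS"

    deploy:
      provider: script
      script: docker push myorg/myimage
      on:
        branch: master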

First, we need to add the credentials for the automation user we created. We do this using environment variables, and we use an encrypted variable for the password.

Tip: If your password has any symbol characters in it, the command that works well for encrypting the password is travis encrypt 'REGISTRY_PASS="<password>"'
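If you have the Travis CLI installed, adding --add will append the encrypted value to your Travis file for you, for example:

    travis encrypt 'REGISTRY_PASS="<password>"' --add env.global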

Moving to the before_deploy section: before we “deploy” the image to Docker Hub, we log in using the docker login command. Docker stores the login credentials on disk, so at that point everything should be set up to connect to Docker Hub.

Next, you’ll notice that we’ve changed the tag for our image from myimage to myorg/myimage. The tag must match the name of the repository that we are pushing to. For the sake of this example, say we created the repository “myimage” under our organization “myorg” on Docker Hub.

In the deploy section, we use the script provider, which just means our deploy step runs a script. The “script” in this case is just a docker command: we push the image.

And that’s about it. When we push changes to the master branch of our Git repository, Travis will build the changes into the Docker image, and push the image to our Docker Hub repository.

Travis and Docker caching

From the output of the example Travis build you may have noticed something near the end:

Our image just has a useless talking cow, but that added a whole ~33MB to the Debian image. What’s worse, if we change something trivial in the Git repository — even if it doesn’t impact the Dockerfile in any way — Travis will dutifully rebuild a completely new image and push it to Docker Hub. And then our users will have to download that ~33MB all over again.

This isn’t normally a problem when building Docker images on your local machine. Docker caches the layers of images, so if it sees that nothing relevant has changed between two invocations of docker build, then it can use the cache and not create new image layers. With Travis, you get a fresh build environment with every build. Or, to put it another way, the Docker cache is thrown away at the end of every build.

Another advantage to having a build cache is that builds can be much faster.

There are ways to add caching to the Travis build:

  1. A combination of docker save and docker load, storing the saved output to a directory for Travis to cache.
  2. docker build’s --cache-from option, which uses a trusted existing image as the cache source.

We will only explore the second solution as it is simpler to use and sufficient in most cases. For more about the first solution read this blog post.

We need to update our Travis file again:
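The relevant changes look something like this (the rest of the file stays the same):

    before_script:
      - docker pull myorg/myimage || true

    script:
      - docker build --pull --cache-from myorg/myimage -t myorg/myimage .
      - docker run myorg/myimage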

The first thing you’ll notice is probably the new before_script step. Here we pull the existing image from Docker Hub. We add a || true here because the command could fail if the image isn’t in the registry yet.

Finally, we adjust the docker build command to add the --cache-from option. Now when we rebuild the image, a new image should only be created if something has really changed in our code. (The --cache-from option doesn’t care if we pass it an image tag that doesn’t exist, but it won’t pull the tag itself.)

An extra option we also need to add is --pull. This ensures that we pull the latest version of the base Debian image that our image is FROM. Otherwise, we might build on top of an older version of the base image that our “cache from” image used.

Here is a build using --cache-from. If the caching is working, you will see a lot of messages like ---> Using cache and the build will be very quick. In this case the image is so simple that the speed gain from caching is not huge, but the docker build step does finish in about one second with caching.

Correct caching is not one of the easier problems in Computer Science, so you need to pick the image that you cache from carefully, particularly if you are versioning your Docker images (a topic we will get to later in this series).

Here are some links to a fully-functional example setup that follows this guide:

That’s it for Part 1 and hopefully it has given you enough to get started. In the next part, we’ll look at different arrangements of Git repositories for Dockerfiles, as well as different build workflows in Travis CI for building images.

Thanks for reading! If you liked this, check out parts 2 (The “Docker repo” pattern) and 3 (Python tools for tagging & testing) of this series.

Written by Jamie Hewland, Service Reliability Engineer


We use technology to solve some of the world's largest social problems. Follow our curated magazine MobileForGood. www.praekelt.org.