Automating Deployment to Google App Engine with Docker and Travis

This post goes into detail about the continuous deployment pipeline that we’ve built for Hollowverse. I hope it will be useful for anyone looking to automate their deployment process on Google App Engine.

Sometimes it does not go as intended. (https://xkcd.com/1319/)

We try to leverage Docker for everything at Hollowverse — whether on CI or for running the actual app on App Engine. Docker provides a secure, isolated and portable environment for building and running code.

Here are some of the benefits of using Docker:

  • Consistency: We ensure that the app is running in a consistent environment across machines. This is particularly important for large projects because it helps ensure that your code does not depend on something that just happens to be installed on your development machine and that all the dependencies in your project are available when your code runs — nothing less and nothing more.
  • Flexibility: We can control and customize the entire environment with Docker, even the operating system.
  • Portability: Since Docker containers are cross-platform and portable, and because most service providers support Docker images, we can migrate our projects or our CI/CD process to another provider without much effort if the need arises.

This post is not intended as an introduction to Docker. If you are new to Docker, there are plenty of excellent resources out there to help you get started.

Building the environment image

For our deployment process, we use something we call an environment image. It’s a Docker image that installs all the dependencies required to deploy our app to App Engine, including the Google Cloud SDK, which provides the command line utility (gcloud) that performs the actual deployment.

The environment image does not contain the actual code of the project. It just provides the environment needed to run the deployment commands that you would otherwise run on your local machine (which is probably not a good idea, unless it is a small personal project).

So you might now be wondering:

If the environment image does not contain the project code, how is the project deployed from a container running this image?

This is where Docker volumes come into play: we mount the directory containing our source code as a volume inside the container. The idea should sound familiar to anyone who has some experience with the way Unix file systems work. Basically, when you run a Docker container, you tell Docker to use an external directory (one from your machine) as a directory inside the container. Any files in this directory will be visible to code running inside the container.

So there are now two things we need for deployment: the source code and the environment image. On every new commit to our GitHub repository, Travis is configured to:

  1. clone our repository,
  2. pull the environment image from Docker Hub, and
  3. run a container based on the environment image, and mount the project code as a volume inside this container at a specified mount point. The command for running a Docker container with a volume looks like this:
docker run -v $(pwd):/code-directory <docker image name>

This runs the specified image with the current working directory ($(pwd)), which contains the source code, mounted at /code-directory inside the container. The directory will be mounted with read and write access by default, so you can move and delete files inside the container.
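By the way, if we ever wanted to prevent the container from modifying the source files, Docker lets us append :ro to the volume flag to mount the directory read-only. A small variation on the command above (the image name is still a placeholder):

docker run -v $(pwd):/code-directory:ro <docker image name>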

With the code available inside the container, the container runs the deploy command provided by the Google Cloud SDK, which looks something like this (assuming /code-directory is where the project code was mounted):

gcloud app deploy /code-directory/app.yaml --project <project id>

Note: You can find the project ID in your project’s dashboard on Google Cloud Platform.

The command exits with an exit code that indicates failure or success, and the container exits with the same code. Travis uses this code to know whether the deployment failed or succeeded.
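To illustrate (a contrived example, with the same placeholder image name): if the deploy command fails inside the container, the docker run command itself exits with that same non-zero code.

docker run -v $(pwd):/code-directory <docker image name>
echo $?   # prints 0 on success, or the non-zero code the deploy command failed with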

Storing App Engine credentials securely

I’ve intentionally glossed over this part in the previous section to make the deployment flow easier to follow. But if you actually want to deploy your app to App Engine, you will have to authenticate gcloud first.

If we were deploying code from a local machine, authentication would be very simple; we would just need to run:

gcloud auth login

This opens a new browser tab that asks us if we want to allow gcloud to access Google Cloud Platform on our behalf. Once this is done, we can return to the terminal and execute the deploy command.

But things are a little different on the CI:

  1. First of all, all of the commands must run non-interactively — you cannot interact with a browser or enter passwords.
  2. Second, even if there were a way to pass your password non-interactively, you probably do not want to use your Google credentials on CI. If these credentials got leaked somehow, whoever had them would be able to access your email and run their code on Google Cloud Platform with you paying the bills. That does not sound like a good idea to me.

So the first thing you should do to properly secure your deployment process is to create a service account.

Service accounts are accounts you create for exactly this kind of automated task. They are not actual Google accounts, i.e. they do not belong to a person, and they can’t do anything outside the project they are created in. They can’t even do anything unless you give them explicit permission. This provides some safety in case the service account credentials get leaked. This is not to say you shouldn’t protect them like any other credentials, but if you are careful about what permissions you grant to the service account, a leak of its credentials won’t be as disastrous as leaking your own Google password. You can also revoke these credentials at any time or change the permissions granted to the service account.

So, a service account sounds great for our use case.

While not required, I recommend you also create a role. A role in Google Cloud Platform defines a group of permissions you can assign to any account in your project, including service accounts. You can also assign multiple roles to one account.

While you can use one of the predefined roles for the new service account, like “App Engine Deployer”, I recommend you create a role specifically for automated deployment purposes. This way you can assign granular permissions to the role and even remove and add permissions as needed if you decide to update your deployment tasks.

You can create a role in the “Roles” section of the “IAM & Admin” page.
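If you prefer the command line, roles can also be created with gcloud. A rough sketch, with a made-up role ID and a couple of example App Engine permissions that you would swap for whatever your deployment actually needs:

gcloud iam roles create automated_deployment \
  --project <project id> \
  --title "Automated Deployment" \
  --permissions appengine.versions.create,appengine.versions.get,appengine.applications.get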

Creating a role in Google Cloud Platform

Now go to the “Service accounts” section and create a new service account with the new role; it will be listed under the “Custom” category of the Role dropdown menu.

Creating a service account in Google Cloud Platform

Now we have a service account with the permissions required to deploy our app from CI. We need to authenticate gcloud with this account. The authentication command for a service account is different from the one for regular accounts. It looks like this:

gcloud auth activate-service-account <service account id> --key-file <path to key file>

So we need two things to authenticate:

  1. The service account ID (which looks like <name>@<project-id>.iam.gserviceaccount.com)
  2. One of the keys associated with this account, in JSON format.

You probably already have a JSON key file generated for you when you first created the service account. If you chose not to create a key when you created the account:

  1. Go back to the “Service Accounts” section of the “IAM & Admin” page.
  2. Click on the options button (the three dots) next to the service account, and choose “Create key”.
  3. Select “JSON” as the key type and click “Create”.

The key file will be downloaded to your machine. Copy the file to your local repository (but do not commit it; I will explain why shortly) and run the authentication command above to test that things are set up correctly. Let’s assume the file is named key-file.json.
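In other words, something like this (the account ID here is made up; yours is shown on the “Service accounts” page):

gcloud auth activate-service-account deploy@<project-id>.iam.gserviceaccount.com --key-file key-file.json
gcloud auth list   # the service account should now show up as the active account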

If you are curious enough, you’ve probably taken a look at the contents of this JSON file. There is a field called “private_key”, so by now you realize that this file should not be shared with anyone who is not supposed to be able to deploy your code.

If your code is public, like in our case, you need to protect this key. You can’t just put it on GitHub in plain text. If you do, anyone would be able to get this key and use it to deploy to your App Engine project.

So, this is why you need to encrypt the key file.

But if the key file is encrypted, how would I be able to authenticate gcloud on the CI?

Good question! When you encrypt the file, you use an encryption key to do the encryption.

You store the components of this encryption key as secure values in Travis settings. Travis will provide these components as environment variables when running the build. Now before you authenticate, you read the key components and use openssl to decrypt the key file.

Okay, this might sound confusing, but it is actually very straightforward. In fact, Travis provides a CLI utility that does all the heavy lifting for you; you just need to run the following commands:

travis login
travis encrypt-file key-file.json

That’s it! This will take care of encrypting the file and storing the encryption key in Travis settings. You can confirm this by going to your project settings in Travis and looking under “Environment Variables”.

When running the build, Travis will ensure that these values are kept hidden from the public logs. They will show as [secure] so you don’t have to worry about those being leaked in the build logs.

In addition to storing the key components, Travis will output the encrypted file as key-file.json.enc.

Make sure you delete the plain-text version and commit the encrypted file to your repository. Never commit the plain-text version. If you do commit it by mistake, revoke the key by clicking the trash icon next to the service account in Google Cloud Platform and create a new one.

The Travis CLI tool will also tell you to update your .travis.yml file to include the decryption command, which looks like this:

openssl aes-256-cbc -K $encrypted_744738cd0ff8_key -iv $encrypted_744738cd0ff8_iv -in key-file.json.enc -out key-file.json -d

Let’s break down this command:

  1. aes-256-cbc specifies the cipher (AES), the key length (256 bits), and the mode of operation (CBC). These are the components of the encryption algorithm Travis uses to encrypt the file. I’m not a cryptographer, but I know from my readings that this is good crypto.
  2. -K is the encryption key, and its value comes from the secure variable encrypted_744738cd0ff8_key stored in Travis settings.
  3. -iv is the initialization vector of the encryption. It is a value used to randomize the encryption so that we get different output each time the same block of data is encrypted. It is not a secret, but it has to be stored in order to be able to decrypt the data.
  4. -in is the encrypted file path.
  5. -out is the path where the decrypted file will be saved.
  6. -d indicates that this is a decryption operation.

While you can add this command to the before_script hook that Travis provides, we do things a little differently here: instead of decrypting the key file outside the container, we do it inside the container. Since we use an environment image, it only makes sense to do everything required to deploy inside the container. This also reduces our dependency on proprietary features of CI providers and makes migration to another provider easier.

So we add the decryption command to the CMD instruction of the Dockerfile (we also install openssl).
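In simplified form, the relevant parts look something like this (the base image, paths, and variable names below are placeholders rather than the exact contents of our Dockerfile):

# A base image that already ships with the Google Cloud SDK
FROM google/cloud-sdk:latest

# openssl is needed to decrypt the service account key at run time
RUN apt-get update && apt-get install -y openssl

# Decrypt the key, authenticate gcloud, and deploy the code mounted at /code-directory
CMD openssl aes-256-cbc -K $encrypted_744738cd0ff8_key -iv $encrypted_744738cd0ff8_iv \
      -in /code-directory/key-file.json.enc -out /code-directory/key-file.json -d && \
    gcloud auth activate-service-account --key-file /code-directory/key-file.json && \
    gcloud app deploy /code-directory/app.yaml --project <project id>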

You can find the entire Dockerfile on GitHub.

We also need to update .travis.yml in our repository to pass the key components as environment variables to the container.
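The relevant part looks roughly like this (a simplified sketch; the secure variable names are the ones travis encrypt-file generated for us, while the image name and mount point are placeholders):

sudo: required
services:
  - docker

before_install:
  - docker pull <org-name>/<repo-name>

script:
  - if [ "$TRAVIS_PULL_REQUEST" = "false" ]; then docker run -v $(pwd):/code-directory -e encrypted_744738cd0ff8_key -e encrypted_744738cd0ff8_iv -e BRANCH=$TRAVIS_BRANCH <org-name>/<repo-name>; fi

Passing -e encrypted_744738cd0ff8_key with no value tells Docker to forward the variable of that name from the Travis build environment into the container, which is what the decryption command inside the image expects.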

You can find the entire file on GitHub.

I’m going to cover the BRANCH variable in another post.

One last thing: you may have noticed that we check to see if the build is for a PR first and skip the docker run command if it is. The build would fail for PRs without this check, because Travis does not pass our secure environment variables to PR builds. If it did, anyone would be able to send us a PR that reads these secure values and sends them to a server they control. However, these values will be available for regular builds, so if we merge the PR, it will be deployed successfully.

This makes sense, but it does come with a downside: since there is nothing to do for PR builds, Travis will show a green check mark next to every PR on GitHub. This only affects deployment, though. If you have tests to run, you can still run them in both cases.

That’s it! We now have a secure, Docker-based continuous deployment pipeline for our App Engine project!

But why stop there when we can automate even more things?

Continuous deployment of the environment image

We’ve also set up continuous deployment for our environment image so that every time we update the Dockerfile of that image, it gets published to Docker Hub via Travis.

So yeah, we continuously deploy our continuous deployment environment.

Setting this up was straightforward. Since we already had our environment image in a dedicated repository, all we had to do was add a .travis.yml file to that repository and tell Travis to start watching it.

In order to be able to push the image to Docker Hub, we created a Docker Hub account specifically for deployment, and stored its credentials in Travis settings for the image repository.

When building the image, it has to be tagged as <org-name>/<repo-name> — this is what the -t flag is for. These refer to the organization name and the repository name on Docker Hub, which do not have to match the ones on GitHub.
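The build-and-push part of that repository’s .travis.yml boils down to a few commands like these (a sketch; DOCKER_USERNAME and DOCKER_PASSWORD are placeholder names for the credential variables stored in Travis settings):

docker build -t <org-name>/<repo-name> .
docker login -u "$DOCKER_USERNAME" -p "$DOCKER_PASSWORD"
docker push <org-name>/<repo-name>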

If you look at the .travis.yml file in our main code repository, you will notice that this is the image that is used when running the container (line 13).

Simplifying the Dockerfile of the environment image

With the basic functionality working, we can further enhance and simplify our Dockerfile. This will also serve as a chance to test our automated deployment process for the environment image.

So far, the deploy command lives in the CMD instruction of the environment image’s Dockerfile.
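Roughly, it is the decryption, authentication, and deployment steps chained together (again with placeholder paths and a placeholder project ID):

CMD openssl aes-256-cbc -K $encrypted_744738cd0ff8_key -iv $encrypted_744738cd0ff8_iv \
      -in /code-directory/key-file.json.enc -out /code-directory/key-file.json -d && \
    gcloud auth activate-service-account --key-file /code-directory/key-file.json && \
    gcloud app deploy /code-directory/app.yaml --project <project id>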

While this does the job for now, you can imagine that as the project grows, we may need to add more commands, and having a bunch of \ and && thrown around is probably not the best way to go about it.

So we decided to extract these commands to a separate deploy script. Since the deploy logic will probably only get more complicated over time, and because we are familiar with JavaScript, we chose to write all our automation scripts in JavaScript for maintainability.

You can find the full deploy.js file on GitHub.
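The gist of it looks something like this (a simplified sketch rather than the actual file; the project ID variable and the retry count are stand-ins):

// deploy.js (sketch): decrypt the key, authenticate gcloud, and deploy,
// retrying the deployment a few times before giving up.
const { execSync } = require('child_process');

const MAX_ATTEMPTS = 3;

const run = command => execSync(command, { stdio: 'inherit' });

run(
  'openssl aes-256-cbc -K $encrypted_744738cd0ff8_key -iv $encrypted_744738cd0ff8_iv ' +
    '-in key-file.json.enc -out key-file.json -d',
);
run('gcloud auth activate-service-account --key-file key-file.json');

let attempt = 0;
while (true) {
  try {
    attempt += 1;
    // PROJECT_ID is assumed to be passed to the container as an environment variable
    run(`gcloud app deploy app.yaml --project ${process.env.PROJECT_ID}`);
    break;
  } catch (error) {
    if (attempt >= MAX_ATTEMPTS) {
      process.exit(1); // let the container, and therefore Travis, see the failure
    }
    console.warn(`Deployment failed, retrying (attempt ${attempt} of ${MAX_ATTEMPTS})...`);
  }
}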

Problems with Google App Engine

You may have noticed that we retry deployment multiple times in the script above. This is because we had some issues with Google App Engine where deployments would sometimes fail for no apparent reason. The deploy command would build the project runtime and continue to update the service, but it would just fail with an “internal error”. Restarting the Travis build a couple of times resolved the issue without having to change our code, which indicated that something probably went wrong on Google’s side.

Regardless of the source of the issue, we decided to retry deployment in the same build a couple of times before giving up, so that we can be a little more confident that a build failure is indeed a problem in our code.

Conclusion

Automating the deployment process is definitely a good idea for any serious project because it lets developers spend their time on actual development and automates away the tedious parts of building and deploying the project. If you are willing to put in some effort, you can have the latest version of your App Engine project automatically available to end users within a few minutes of pushing code.