Dev Chat: A brief intro to Containers

A look at Message.io’s approach to Atlassian Bitbucket Pipelines, Docker, and Jenkins, plus tips from implementing CI/CD internally.

By Ian Seyer, DevOps Engineer @ Message.io

Containerization is changing the world.

For developers, containerization provides distinct benefits on mission-critical projects, whether you’re:

  • Writing new software
  • Migrating applications to new hardware
  • Scaling your infrastructure
  • Evaluating new services

In this article, I’ll cover how our team went from bimonthly releases to continuous deployment by leveraging Atlassian’s new Bitbucket Pipelines, and how we moved from manual testing to an automated testing process.

Atlassian Bitbucket Pipelines

Assumptions

In order to get the most from this post, we’ll assume you’ve read up on the following concepts:

  1. Docker, rkt, and similar container frameworks
  2. Continuous integration & continuous deployments (CI/CD)
  3. Testing frameworks
  4. A basic understanding of hosting applications on a server

The Old Way

If you’re a DevOps professional, you’re likely familiar with at least a few of the common CI/CD tools.

If you’ve implemented any of them, you know that every tool has its limitations. At Message.io, we run Jenkins, and the majority of our core system is built in PHP, with several microservices written in various languages.

Exploring Jenkins deployment workflows:

While every development team runs a little differently, your process is probably along the lines of pulling code, installing dependencies, and pushing the result to a server somewhere.

Here’s a common scenario most dev teams handle during releases:

  1. Your developers, chomping at the bit to ship a feature or a hotfix, merge it into your development branch
  2. Your git host sends a webhook notification to Jenkins to say “hey, somebody pushed to develop!”
  3. Jenkins runs all the steps required to bundle the application and its dependencies: fetching the source code, running npm/pip/go get/composer install, shuffling folders into the right place, and copying the result to your target server.

Truth is, there’s a lot to unpack in step 3.
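
To make step 3 concrete, here’s a hedged sketch of the kind of hand-rolled “Execute shell” build step a lot of Jenkins jobs end up with. The server names, paths, and repo URL are hypothetical, not our actual setup:

#!/bin/bash
# Illustrative old-school Jenkins deploy script (hypothetical names throughout)
set -e

git clone git@bitbucket.org:yourteam/yourapp.git build/        # fetch the source
cd build/
composer install --no-dev                                      # bundle the dependencies
rsync -az --delete ./ deploy@app-server-01:/var/www/yourapp/   # copy the result to the target server
ssh deploy@app-server-01 'sudo service httpd reload'           # ...and hope the server was ready for it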

Let’s be honest, many teams make the following assumptions:

  1. Your application works with any version of most application-level dependencies
  2. Your server will have the right compilers/runtimes/server/database already in place, ready to receive code (you spent so much time crafting your httpd.conf didn't you?)
  3. Testing can happen once you deploy your code to the dev/testing/staging server
  4. The right people have the right access to the right server
  5. Prod is configured just like your dev environment

These are dangerous assumptions!

Here’s a few reasons to rethink your team release process:

  1. Target servers can be dead in the water without you knowing! Short of having extensive health checks running, you can’t be sure that the target server will be in a healthy state and ready to receive code.
  2. A routine deploy could silently upgrade a dependency that has introduced breaking changes. This is bad news for obvious reasons.
  3. If you are properly treating servers like cattle and not like pets, you shouldn’t have to make any assumptions about what the target server is. When you push your code to it, the server should know exactly how to handle it. Whether it’s a web server, an Elasticsearch cluster, or a PyTorch training net, the server should be configured upon code arrival.
  4. It’s not repeatable! Try taking your Jenkins job and having it deploy to 100 servers without any manual configuration.
  5. Storing configuration outside of the application runtime and setting it up separately every time leads to a widening gap between your development and production environments. This is bad practice, as it quickly leads to more friction between development and deployment. In fact…
  6. this widening gap also means your tests are out of sync with your environments!

I hope these are obvious, but if your tests no longer …test things… that is no bueno.

Okay! That list is getting a little too big for me, so we’ll head back to the main takeaway:

Containerize your application already, or: Introducing Bitbucket Pipelines

Odds are this very conversation has already come up in your engineering planning meetings, and a lot of people moaned and groaned about how it’s going to take too long and won’t be worth it.

Remember, these are the lies we tell ourselves. Have a deployment plan and build safeguards for your team!

We tend to use single-function services at Message.io. Atlassian’s approach is the exception: they provide a massive suite of products for entire companies to run on.

Specifically, Bitbucket Pipelines is focused on maximizing the time-saved-using to time-spent-implementing ratio. Remember, time is money. 💸

Side note: Try out Drone.io

When I was standing up build systems for various side projects, I turned to drone.io, a drop-dead-simple CI/CD tool with an awesome feature: the entire pipeline is a single .yml file, and every step of the build process occurs within a Docker container.

This not only meant that the build process itself was more efficient (parallel builds on a single machine!?!), but also that the final result of the build process was a ready-to-run Docker image.

Pipelines adopts a similar philosophy: the entire build process occurs within a Docker container (and you can even access the Docker daemon for extra features). And that’s just the beginning of what these tools offer developers who want to work smarter and faster with confidence.

What this means for your process

  1. You can Dockerize your entire application with a short .yml file (a minimal sketch follows this list).
  2. You can provide your own base image, i.e., a Linux installation (with a flavor of your choosing; I recommend Alpine wherever possible) with all the right dependencies and configuration in place. More on this later.
  3. Executing your tests on every push to a branch of your choosing is drop-dead simple (now, getting developers to write them? Not even Atlassian can solve that.)
  4. Because the execution context of the application is the base image, it is static, predictable, and repeatable.
  5. You can deploy your application to literally any server that can run Docker, automatically. (Maybe cut down on AWS costs by making a supercomputer out of Raspberry Pis? Really though.)
  6. You can leverage the top tools today — Kubernetes, Rancher, DC/OS, CodeDeploy, etc. Your application has just made huge strides on the path to true scale.
  7. You can now start to chunk out your monolithic application into manageable and efficient microservices.
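
As a taste of point 1, here’s roughly the shape of a minimal bitbucket-pipelines.yml. This is a sketch with placeholder names (composer:latest as the image, myapp as the tag, and a stand-in test command), not our production config; the real file we use is shown later in this article.

# bitbucket-pipelines.yml -- a minimal sketch with placeholder names
image: composer:latest            # any public Docker image, or your own base image

pipelines:
  default:                        # runs on every push to every branch
    - step:
        script:
          - composer install            # pull in dependencies
          - vendor/bin/codecept run     # run the test suite
          - docker build -t myapp .     # bake the result into an image

options:
  docker: true                    # gives the step access to the Docker daemon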

Making your own Base Image

This next step is optional, as Bitbucket provides several starting images here, but rolling your own image has several benefits, including a snapshot of your requirements, a smaller image size, and baked-in configuration. If you don’t want to do this, feel free to skip ahead.

Think about it — your application needs a few things in order for it to work. Your application doesn’t exist in a vacuum, rather it lives in an execution context.

In order to make a base image for your application, you need to ask yourself this question:

  • What does your software need in order to function?

Once you have a good understanding of your requirements, write a Dockerfile to capture them. I’m not going to teach you how, but other people will.

Reference: Our PHP-stack Dockerfile base image:

FROM amazonlinux:2017.03
RUN yum install -y php7.0 php7-pear.noarch php70-json.x86_64 php70-mbstring.x86_64 \
gcc openssl openssl-devel git
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/$
RUN pecl7 install mongodb igbinary memcache
RUN echo "extension=mongodb.so" >> /etc/php-7.0.d/php.ini
RUN curl -O https://bootstrap.pypa.io/get-pip.py
RUN python get-pip.py --user
RUN ~/.local/bin/pip install awscli --upgrade

For brevity, here’s a broad-stroke summary:

  1. We use AWS at the moment, so we’re going to want to use Amazon Linux
  2. Install PHP, gcc, OpenSSL, and git
  3. Install Composer (a PHP package management tool)
  4. Install our PHP extensions (mongodb, igbinary, memcache)
  5. Install the AWS CLI.

Note: This is just your base image. It has no knowledge of your code or your application-level dependencies. That’s why we aren’t doing anything like composer install just yet.

That’s it! That’s our base image. It contains everything we need to run our code’s tests (which are testing everything your application does, right?).

Note: we have not set up any kind of Apache or Nginx server; we will save that step for the Deployment article.

There are tons of things you can do here to customize your image, but once again, that’s out of scope here.

Docker Build

The next step is to build your Docker image (docker build), tag it, and push it to a registry (which is basically GitHub, but for Docker images). If you're using AWS, registries are drop-dead simple to set up via ECR. If not, you can run your own. I've also had good luck with Artifactory.
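
If you’re doing that first build and push by hand, it’s only a handful of commands. Here’s a rough sketch; the registry URL, region, and tags are placeholders for whatever ECR (or Artifactory) gives you:

# Build, tag, and push the base image by hand (placeholder registry URL and tags)
docker build -t base:latest .
docker tag base:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/base:latest

# Authenticate against the registry first (this is the ECR flavor; other registries just use docker login)
aws ecr get-login --no-include-email --region us-east-1 | bash
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/base:latest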

Once you’ve done that, it’s time to build your Pipeline.

For posterity and later reference, here’s the testing portion of our pipeline:

image:
  name: $DOCKER_REGISTRY/base:latest
  aws:
    access-key: $AWS_ACCESS_KEY_ID
    secret-key: $AWS_SECRET_ACCESS_KEY

pipelines:
  default:
    - step:
        caches:
          - composer
        script:
          - cd thecode/
          - composer install
          - vendor/bin/codecept run
          - aws ecr get-login --no-include-email --region <region> | bash
          - docker build -t $DOCKER_REGISTRY/build:latest .
          - docker push $DOCKER_REGISTRY/build:latest
        services:
          - mysql
          - mongo

definitions:
  services:
    mysql:
      image: mariadb:latest
      environment:
        MYSQL_ROOT_PASSWORD: test
        MYSQL_DATABASE: test
        MYSQL_USER: test
        MYSQL_PASSWORD: test
    mongo:
      image: mongo

options:
  docker: true

Here’s a breakdown of what this does:

Pipelines run in a Docker container and therefore have to have a Dockerized place to start (this is your base image). In our case, we kick off the build by pulling down our monorepo’s base image ($DOCKER_REGISTRY/base:latest above). All future steps in the pipeline are executed inside of that image.

Each line of the yml file inside the script block is executed via /bin/bash in the Docker container. It's like being SSH'd right into your machine. No DSLs to learn, no kludgy interfaces to deal with. Just nice, crisp text. (Also, you get credentials management out of the box via familiar environment variables.)

You’ll notice that we cd right into our code. This is because pipelines has already pulled our latest commit into the WORKDIR of the Dockerfile. Neat, huh?

We then install our own internal library, and then this specific application’s code (api-manager).

Testing with Codeception

Then we test (we use Codeception). That’s right, we test. That easily. Every. Time. Someone. Commits. Now, if these tests fail, Codeception exits with a non-zero status and the build is considered a failure. Because of how it's configured in Bitbucket, that failure triggers a Slack notification so that everybody knows what failed and when. It even provides a link to the relevant pipeline commit page for further debugging.
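
For reference, the database side of a Codeception setup mostly boils down to pointing its modules at the MariaDB and MongoDB service containers defined at the bottom of the pipeline file (more on those below). Here’s a hedged sketch of what that module config in codeception.yml can look like; ours differs in the details, and the credentials match the test values from the pipeline above:

# codeception.yml -- illustrative module config only
modules:
    config:
        Db:
            dsn: 'mysql:host=127.0.0.1;dbname=test'    # the mariadb sidekick, right on localhost
            user: 'test'
            password: 'test'
        MongoDb:
            dsn: 'mongodb://127.0.0.1:27017/test'      # the mongo sidekick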

We’ll pause here: at this point, your tests are automated and you have just containerized your application.

I’m going to cover deploying this container in another article later down the road, but I wanted to write this up as a quick and easy way to show, if not to promote adoption of Bitbucket Pipelines directly, that there are tools out there that facilitate containerizing even the most complex applications, and that the excuses not to do so are waning.

As a “sneak peek” of sorts, you can probably guess what is happening after that vendor/bin/codecept run command. (Note: that options block at the bottom is important).

Note: using sidekick containers

You will notice that there are some bits in the bitbucket-pipelines.yml file I didn't discuss, namely the definitions block.

While those definitions are short, they are quite powerful: they give the Pipelines execution environment access to a MariaDB and a MongoDB instance for testing purposes. Right off the bat. No networking to deal with; they’re sitting right there on localhost, just like you'd hope. All I have to do is define them and link them to the main environment via the services block.

You can spawn any prebuilt image you can find (or your own pre-seeded database, for example!) alongside your container. These are commonly called “sidekick” containers.
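
For instance, here’s a hedged sketch of what defining your own pre-seeded database image as a sidekick could look like; the seeded-db name and the mariadb-seeded image are hypothetical, not something we actually run:

# In the definitions block of bitbucket-pipelines.yml (illustrative only)
definitions:
  services:
    seeded-db:
      image: $DOCKER_REGISTRY/mariadb-seeded:latest   # your own image, pre-loaded with test data
      environment:
        MYSQL_ROOT_PASSWORD: test

You’d then list seeded-db under the step’s services block, exactly like mysql and mongo above.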

Deployment guides from Bitbucket

If you don’t want to wait around for Part 2 of this article, and want to go ahead and sink your teeth in, Bitbucket has your back here. They provide tons of pre-made pipelines for deploying to AWS, GCE, Heroku, Kubernetes, and Azure.
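
To give a flavor of what those look like, here’s a hedged sketch of a branch-triggered deploy step that pushes the tested image and bounces an ECS service. The region, cluster, and service names are placeholders, and Bitbucket’s own guides go into far more detail:

# Deployment sketch (placeholder names; Docker access comes from the options: docker: true block shown earlier)
pipelines:
  branches:
    master:                     # only runs on pushes to master
      - step:
          script:
            - aws ecr get-login --no-include-email --region us-east-1 | bash
            - docker push $DOCKER_REGISTRY/build:latest
            - aws ecs update-service --cluster my-cluster --service my-service --force-new-deployment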

Wrap Up

Containerization is a powerful way to handle and scale modern software development. Building without a CI/CD process is increasingly antiquated. Have a plan and you’ll remove the common bad surprises that come with running different tech stacks across various systems.

There are lots of ways to reach the same goal (Jenkins has Docker plugins, Drone.io is awesome, etc.), and this isn’t meant to be the be-all and end-all. I’m just glad that Atlassian, a huge developer company, has created a tool that makes it dead easy to leverage some very modern technology. A rising tide lifts all boats.

Have a topic you’re interested in? Let us know on Twitter @message_io

About Ian Seyer

Ian Seyer is a DevOps engineer at Message.io.

Based in Austin, he focuses on making software development and deployment safe, repeatable, and efficient.

Contact: You can find Ian on LinkedIn and Twitter
