Lean Golang Docker Images Using Multi-Stage Builds

Deploying and running an application using Docker is something that can be done in seconds: in our case, just grab an image, push it to the repository and run it on AWS ECS. However, this simplicity may easily result in an image that is hundreds of megabytes in size.

At TourRadar, we have historically written our applications mostly in PHP. In March this year, however, we introduced Golang to our stack, given a few good use cases for it. At first, we used the same approach to building Docker images as for PHP: just take the official image (alpine or stretch).

As a practical example, let’s quickly develop a simple web server in Golang and dockerize it.
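
The server itself is not shown in this post, so here is a minimal sketch of what main.go could look like, assuming nothing beyond the Go standard library. The message helper and the response text are hypothetical names chosen for illustration; the only detail taken from the Dockerfile is port 3000.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// message builds the response body for a request path.
// (Hypothetical helper, not part of the original project.)
func message(path string) string {
	return fmt.Sprintf("Hello from Go, you requested %s\n", path)
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, message(r.URL.Path))
	})

	// Port 3000 matches the EXPOSE instruction in the Dockerfile.
	log.Fatal(http.ListenAndServe(":3000", nil))
}
```

Because it imports only the standard library, a server like this has no third-party modules to vendor, which keeps the image-size comparison about the base image rather than the application.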

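ARG GO_VERSION=1.12
FROM golang:${GO_VERSION}

# Absolute paths keep the location of the build output predictable.
COPY . /var/app
WORKDIR /var/app

ENV APP_BUILD_NAME="main"
ENV GO111MODULE="on" \
    GOOS=linux
RUN go build -mod=vendor -o ${APP_BUILD_NAME} main.go
RUN chmod +x ${APP_BUILD_NAME}

EXPOSE 3000
# The binary is written to the WORKDIR, so the entry point must point there.
ENTRYPOINT ["/var/app/main"]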

The resulting image weighs in at 833 MB. Let’s now try the Alpine variant of the golang image. Alpine is a minimal Docker image based on Alpine Linux with a complete package index and only 5 MB in size.

At 369 MB, the Alpine-based image still doesn’t feel small enough for a simple web server. How can a simple web app written in Golang be that big? We can certainly do better. At most, each container should contain the application code, language-specific dependencies, and OS dependencies. Anything more is waste and even a potential security hazard.

One of the cool things about Golang is that it compiles to a single binary that can be linked statically, which means we can build that binary once and run it without installing any runtime or libraries alongside it. This gives us room for optimization: the final image only needs the bare minimum required to run a binary.

Enter Multi-stage Docker Builds

Thanks to Docker multi-stage builds (supported from Docker 17.05), we can build tiny Docker images with only a binary inside. Also known as the builder pattern, it involves using two Docker images: one to perform a build and another to ship the results of the first build without the penalty of the build-chain and tooling in the first image. It helps us keep our configuration DRY by using artifacts from one image to another.

Our application will be a binary in a Docker scratch image (think of a Docker image with nothing inside it). You cannot pull this image, but you can refer to it in your Dockerfile.

Let’s create a new Dockerfile, which docker-compose can then use for both local and production builds.

1. The first stage is called “dev”. It prepares the container for building and running the application; docker-compose.yml later supplies the command that is passed to its entry point.

ARG GO_VERSION=1.12
FROM golang:${GO_VERSION}-alpine AS dev

ENV APP_NAME="main" \
    APP_PATH="/var/app" \
    APP_PORT=3000

ENV APP_BUILD_NAME="${APP_NAME}"

COPY . ${APP_PATH}
WORKDIR ${APP_PATH}

ENV GO111MODULE="on" \
    CGO_ENABLED=0 \
    GOOS=linux \
    GOFLAGS="-mod=vendor"

EXPOSE ${APP_PORT}
ENTRYPOINT ["sh"]

In this stage, we are just pulling the Golang Alpine image and preparing the environment to build a binary.

Here, CGO_ENABLED=0 disables cgo so that the Go application is built as a static binary. The binary then does not depend on any shared C libraries, so it carries everything it needs when we copy it into the final image.

2. In the second stage (“build”), we create a production-ready binary of the application. We use Go modules (introduced in Go 1.11) for dependency management and build from the vendor directory, which is why we pass the ‘-mod=vendor’ flag.

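FROM dev AS build

# Populate vendor/ only when the repository does not already ship one.
RUN (([ ! -d "${APP_PATH}/vendor" ] && go mod download && go mod vendor) || true)
# -s -w drops the symbol table and DWARF debug information to shrink the binary.
RUN go build -ldflags="-s -w" -mod=vendor -o ${APP_BUILD_NAME} main.go
RUN chmod +x ${APP_BUILD_NAME}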

3. In the third stage (“prod”), we put the binary in the Docker scratch image and prepare it to run in the production environment. In Docker, you can copy specific files from one stage to another by using COPY --from=<stage>.

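FROM scratch AS prod

# ENV values from earlier stages are not inherited by a new base image,
# so everything this stage needs (including APP_PORT) is set again here.
ENV APP_BUILD_PATH="/var/app" \
    APP_BUILD_NAME="main" \
    APP_PORT=3000
WORKDIR ${APP_BUILD_PATH}
COPY --from=build ${APP_BUILD_PATH}/${APP_BUILD_NAME} ${APP_BUILD_PATH}/

EXPOSE ${APP_PORT}
ENTRYPOINT ["/var/app/main"]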

4. Now we can build this Dockerfile and look at the size of our image after the changes.

docker build ./ -t my-go-app-golang

For our small web server, the entire image is 5.34 MB in size. The scratch base adds no OS layer, so that is essentially just the compiled binary. That’s two orders of magnitude smaller than the 833 MB we began with.

5. Finally, let’s prepare docker-compose.yml so that it can run both local and production builds. When you develop an application in a local environment and want to use hot reloading, you need a dev stage in addition to the build stage, and we already have both.

When you create a REST API using the Gin framework, for example, and run it with https://github.com/codegangsta/gin, that tool watches for file changes and rebuilds the binary automatically for you.

But what should docker-compose.yml look like for such applications? Let’s look at that.

version: '3.5'
services:
  app_dev:
    volumes:
      - .:/var/app:delegated
    build:
      context: ./
      dockerfile: Dockerfile
      target: dev
    command: "scripts/start-dev.sh"
    environment:
      APP_PORT: 8282
    ports:
      - "8282:8282"

  app_prod:
    environment:
      APP_PORT: 8283
    build:
      context: ./
      dockerfile: Dockerfile
      target: prod
    ports:
      - "8283:8283"

There are a few things to highlight here:

  • The build ‘target’ option, which selects a stage of a multi-stage Dockerfile, requires Compose file format 3.4 or later.
  • Each stage we want to run (dev and prod) is mapped to its own service through the ‘target’ option.
  • You can use YAML anchors to prevent code duplication across services.
  • We use a volume in the local environment to sync our code with the container.
  • In scripts/start-dev.sh, we have a command to start the gin live-reload tool (codegangsta/gin).
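
Both services set APP_PORT to a different value, so the application has to pick the port up from its environment rather than hard-coding it. A minimal sketch of how that could look, assuming the server falls back to 3000 when the variable is unset; the listenAddr helper and the handler are hypothetical and not taken from our code base.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

// listenAddr turns the APP_PORT value into a listen address,
// falling back to 3000 (the Dockerfile default) when it is empty.
// (Hypothetical helper introduced for illustration.)
func listenAddr(port string) string {
	if port == "" {
		port = "3000"
	}
	return ":" + port
}

func main() {
	addr := listenAddr(os.Getenv("APP_PORT"))

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	log.Printf("listening on %s", addr)
	log.Fatal(http.ListenAndServe(addr, nil))
}
```

With this in place, app_dev would listen on 8282 and app_prod on 8283, matching the port mappings in docker-compose.yml.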

With the Dockerfile and docker-compose.yml set up, we can then build and run the images easily:

docker-compose build
docker-compose run app_dev
docker-compose run app_prod

As an aside, Golang isn’t the only language that can benefit from using one base image to build assets and a second image to run them. We leverage the same builder pattern approach for our Python images, for instance in our Machine Learning applications. There, we use the Alpine-based python image as a base and pre-install all dependencies in the builder stage:

RUN pip install --install-option="--prefix=/install" -r /requirements.txt

We can then copy those dependencies from /install into the production image. As a result, the builder stage is cached and deployments are faster. Check this cool article for more on this.

In Closing

At first, using an official Docker image for an application is a very good choice. The Docker build system allows us to create images that are very large if written naively but also small, lightweight, and cacheable if done correctly.

With a true DevOps mindset, trying to make your Docker images smaller is a natural progression. Going further, the next step for us will be switching to an Alpine image or other small images like busybox. And if you are able to convert your project into a binary, using Docker Scratch image is a good idea.

Using the process detailed above, we have successfully leveraged Docker’s multi-stage builds for our Golang applications. That helped us decrease image size and speed up the build process by caching whole stages. All our application images in Amazon Elastic Container Registry (ECR) are now less than 12 MB.

And what are the real benefits, you ask? We can think of a few important ones:

  • Fewer bytes to send over the network and store on disk.
  • Faster to build and deploy the container, accelerating our CI/CD pipelines.
  • More cost-effective to store images.
  • Cleaner and more secure: no useless stuff that can be exploited.

Going forward, in the good spirit of Agile, we will keep iterating on this, and constantly push for more efficiency, reliability, and security across our stack. We would love to compare notes, so do let us know in the comments about your own experience with building containerized applications. Meanwhile, there are plenty of great resources out there that have helped us out.

Written by Alexander Yaremchuk, Senior Software Engineer / TourRadar

TourRadar’s the world’s largest online travel agency for multi-day tours. Housing over 2,000 tour operators, we offer more than 40,000 tours in 200 countries.
