NOT containers 101: Speeding up your deployments

Abby Fuller
Sep 7, 2018 · 4 min read

I think that hands-down, one of the most common conversations I have with people is something along the lines of “I’ve deployed ${thing}- now I have ${problem}?”

This conversation follows a pretty established path:

  1. I started using ${thing}
  2. I followed <insert blog, tutorial, workshop, documentation, intuition, witchy grimoire here>
  3. I’m unhappy- things did not go as expected/I still don’t have a magical solution for all of my issues 😕

This is totally normal- when we’re first getting started, things like “Getting started with X” and introductory workshops are great resources. But what happens afterward? Let’s look at a couple of common problems and how we can make them just a little bit better. In this post, we’ll talk about how to speed up your deployments by tuning a few settings and making your Docker images smaller.


“My deployment is slow!”

This is probably the most common one! There are a few possible culprits here.

First up: check your image sizes. The larger the image, the slower the deploy, since more time is spent pushing and pulling from the registry. You can check your image sizes locally by running:

$ docker images

This should return something like:

REPOSITORY          TAG              IMAGE ID       CREATED        SIZE
amazonlinux         2.0.20180622.1   585cc50169e6   2 months ago   163MB
python              latest           a5b7afcfdcc8   3 months ago   912MB
tensorflow-base     1.6.0-cpu-py2    30ad61eefa75   3 months ago   1.73GB
golang              latest           3f30f1fc3c43   3 months ago   794MB
ubuntu              16.04            5e8b97a2a082   3 months ago   114MB
debian              latest           8626492fecd3   4 months ago   101MB
alpine              latest           3fd9065eaf02   7 months ago   4.15MB

See the column under SIZE? Those are your image sizes. So what can you do if it turns out your images are pretty big? TL;DR: image sizes are determined (mostly) by the number of layers your image has, and how big those layers are. Image size is worth an entire post on its own (I talk about it a lot), but there are a few initial tips:

  • Use shared base images (or a smaller base image) wherever possible
  • Limit the data written to the container layer
  • Chain RUN statements (there’s a Dockerfile sketch after this list)
  • Prevent cache misses at build for as long as possible
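
To make the base-image and RUN-chaining tips concrete, here’s a minimal Dockerfile sketch. The alpine base and the app files are just stand-ins for whatever your image actually needs:

# Hypothetical example: a small base image plus one chained RUN layer
FROM alpine:3.8

# Chaining the install and cleanup into a single RUN creates one layer
# instead of several; --no-cache skips storing apk's package index
RUN apk add --no-cache python3 \
 && rm -rf /tmp/*

COPY app.py /app/app.py
CMD ["python3", "/app/app.py"]

The same commands split across separate RUN instructions would create a layer per instruction, and anything deleted in a later layer would still be carried around in the earlier one.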

Most important for Docker image size is understanding the cache: starting from the parent image, Docker will look at each instruction to see if it matches a cached layer. Only ADD and COPY compare the checksums of the files involved; for every other instruction, only the string of the command is compared. If the string is unchanged, it is considered a match, and the cached layer is reused. Once the cache is broken, every subsequent layer is built again.
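
Here’s what that means in practice, sketched for a hypothetical Python app: copy the dependency manifest (whose checksum rarely changes) before the application code (whose checksum changes constantly), so the expensive install layer stays cached:

FROM python:3.6-slim

# requirements.txt rarely changes, so its checksum usually matches the cache
# and this layer (plus the install below) gets reused on every build
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt

# Application code changes all the time; copying it last means a code change
# only rebuilds from here down
COPY . /app
CMD ["python", "/app/app.py"]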

The larger the image, the longer it will take to push and pull from the registry (and build locally), and the slower your deploys will be.
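
If you’re not sure where the size is coming from, docker history breaks an image down layer by layer (the image name here is just one from the table above):

$ docker history python:latest

Each row is a layer with its own SIZE column, so the instruction responsible for most of the bulk is usually easy to spot.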

Once you’ve looked at your image sizes, check your EC2-level settings. Let’s talk about “secret”* Autoscaling Group (ASG) settings.

*not actually secret, just harder/less intuitive to find ¯\_(ツ)_/¯

These settings impact a) how quickly your ASG can scale, and b) how quickly it will mark your containers healthy or unhealthy:

Default Cooldown: how long the autoscaling group will wait after a scaling activity before evaluating the scaling rules again.

Health Check Grace Period: amount of time, in seconds, that the autoscaling group will wait before health checking the service.

Since containers running as part of services managed by ECS/Fargate/EKS are backed by these ASGs, changing these settings can affect how long it takes your service to scale, and how quickly containers pass or fail healthchecks. The defaults here are pretty generous: in general, you don’t need to wait 300 seconds for your container process to be up and running. The faster your process starts up, the smaller you can make this window. Changing it to something like 10 seconds will significantly speed up your deployment (and if you need 300 seconds to start your process, consider rethinking that).

You also don’t need to wait 300 seconds after evaluating the ASG rules before re-evaluating- this will make your scaling response time pretty slow. Consider something like 30 seconds. For both of these values, you can continue to tweak and re-adjust as you figure out the balance that works for you.
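
Both of these are editable in the console, but if you’d rather script it, something like this works with the AWS CLI (the group name and the exact values are just examples; tune them to your own startup time):

$ aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name my-ecs-cluster-asg \
    --default-cooldown 30 \
    --health-check-grace-period 10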

You can also affect health check settings at the load balancer level.

(Screenshot: advanced health check settings, yo.)

Healthy and unhealthy thresholds are determined at the load balancer level (in this screenshot, we’re looking at an Application Load Balancer (ALB)).

Healthy threshold: how many times your container must pass the load balancer health check before being marked healthy.

Unhealthy threshold: how many times your load balancer will allow your container to fail the health check before marking it unhealthy.

The lower you set these thresholds (and the shorter the health check interval), the faster your deployment will be. Be careful, though- you still have to give your service enough time to actually pass.
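
For an ALB, these live on the target group, so a sketch with the AWS CLI looks something like this (the ARN is a placeholder, and the values are just a starting point):

$ aws elbv2 modify-target-group \
    --target-group-arn <your-target-group-arn> \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 2 \
    --health-check-interval-seconds 10

With a 10-second interval and a threshold of 2, a healthy container can be marked healthy in roughly 20 seconds instead of several minutes.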


Got questions, comments, or a way to speed up deployments that I missed? Let me know here, or on Twitter: I’m @abbyfuller 👋

