Make AWS CodePipeline 20x faster and 10x cheaper using EFS with AWS CodeBuild.

Paul Carr
Published in Investing in Tech · 6 min read · Feb 17, 2020

Introduction

The default, out-of-the-box CodePipeline configuration (as used by most people) consists of a source stage, a build stage, and a deploy stage (or several deploy stages, depending on how many environments your pipeline spans).

CodePipeline uses S3 as its artifact repository by default. Once CodeBuild has zipped up the completed build, it pushes it into S3, making the artifact available to all the downstream CodeDeploy jobs.
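For reference, a minimal default-style buildspec.yml looks something like this (the Node commands and the public output folder are illustrative assumptions); the artifacts section is what CodeBuild zips up and uploads to the S3 artifact bucket:

version: 0.2
phases:
  build:
    commands:
      - npm ci
      - npm run build
artifacts:
  files:
    - '**/*'
  base-directory: public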

The Problem

This zipping and uploading to S3 takes a lot of time and involves pushing a large chunk of data through your NAT Gateway, which you get billed for.

This isn’t so bad if you build infrequently, but if you’re trying to do CI/CD it gets very expensive very quickly. Not only do you have a team of developers sitting around twiddling their thumbs waiting for the pipeline, you’re also sending many gigabytes of data through the NAT Gateway. Sometime later, the CodeDeploy jobs have to pull this build back down from S3, unzip it, and copy the files to the right place … all very expensive and time-consuming.

You can set up a VPC gateway endpoint for S3 to avoid part of the cost, but you’ll still have a really slow pipeline.
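As a sketch (the VPC and route-table IDs are placeholders), a gateway endpoint for S3 is a one-liner with the AWS CLI:

aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.eu-west-1.s3 \
  --route-table-ids rtb-0123456789abcdef0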

How bad is it?

For example, I discovered that one of our pipelines, a Gatsby React app, was pushing nearly 600MB of data for every build. Making matters worse, this pipeline could be triggered any time a content editor decided to modify a page or add a new image, meaning the pipeline could run up to 40 times a day. This equated to $1,500 a month in NAT Gateway costs alone, never mind the lost man-hours.

How many builds per day!

The EFS Solution

AWS Elastic File System

AWS announced last week that CodeBuild can now mount EFS drives. I think their intention was caching Maven / npm repos so you don’t need to re-download ALL your dependencies for each build.

However, because the same EFS drive can also be mounted on any other instance downstream, every box from the CodeBuild stage onwards can access the same disk.

To mount a drive for use with CodeBuild, you just edit the CodeBuild environment and scroll to the bottom of the advanced configuration:
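The same setting is exposed through the CLI as the project’s file-system locations (the project name and file-system ID below are placeholders). Note that the build project has to be attached to a VPC so it can reach the EFS mount targets:

aws codebuild update-project \
  --name my-gatsby-build \
  --file-system-locations '[{
      "type": "EFS",
      "location": "fs-0123456789abcdef0.efs.eu-west-1.amazonaws.com:/builds",
      "mountPoint": "/mnt/efs",
      "identifier": "efs_builds"
    }]'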

So, let’s say that you have four CodeDeploy stages in the pipeline: DEV, QA, STAGING, and PROD. The EFS drive that CodeBuild mounts can also be mounted on all of these downstream instances.
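On each target instance, mounting is a couple of commands (the file-system ID is a placeholder, and this assumes the amazon-efs-utils package is available):

sudo yum install -y amazon-efs-utils
sudo mkdir -p /mnt/efs
sudo mount -t efs fs-0123456789abcdef0:/ /mnt/efs
# To survive reboots, add an entry like this to /etc/fstab:
# fs-0123456789abcdef0:/ /mnt/efs efs _netdev,tls 0 0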

Simply change the buildspec.yml to store the completed build on the EFS drive, and make the ZIP artifact from the CodeBuild job contain only a set of deploy scripts and the BUILD_ID. Each CodeDeploy job then only has to pull down a <1KB zip file from S3, and the scripts within it can simply change a symlink on the target machine to point at the new BUILD_ID folder.
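Here’s a minimal sketch of that buildspec.yml change (the /mnt/efs mount point, the public output folder, and the deploy/ scripts directory are all assumptions):

version: 0.2
phases:
  build:
    commands:
      - npm ci
      - npm run build
  post_build:
    commands:
      # Park the finished build on the shared EFS drive, keyed by build number
      - mkdir -p /mnt/efs/builds/$CODEBUILD_BUILD_NUMBER
      - cp -r public/. /mnt/efs/builds/$CODEBUILD_BUILD_NUMBER/
      # Record the build ID so the deploy scripts know which folder to point at
      - echo $CODEBUILD_BUILD_NUMBER > deploy/BUILD_ID
artifacts:
  files:
    - deploy/**/*

And the CodeDeploy hook script that flips the symlink can be as short as this (assuming the appspec copies the tiny artifact to /opt/deploy, and that the web server serves from /var/www/current):

#!/bin/bash
set -euo pipefail
BUILD_ID=$(cat /opt/deploy/BUILD_ID)
# Atomically repoint the web root at the new build folder on EFS
ln -sfn "/mnt/efs/builds/$BUILD_ID" /var/www/current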

How much do you save?

Using this approach, I reduced the aforementioned Gatsby app’s CodeDeploy stage duration from over 4 minutes to 12 seconds!

If you have four CodeDeploy stages, that reduces the total deploy duration of your pipeline from 16 minutes to 48 seconds.

Also, the $1,500 per month NAT Gateway bill is now just $48.

What’s the catch?

There are a few slight downsides. To roll back to a previous build in a traditional S3-based pipeline, you would simply open the CodeDeploy page, find the deployment you wanted to roll back to, and click “Copy Deployment”.

Rolling back to an old build

For the same process to work on EFS, you would need to store ALL your historic builds on the EFS drive … but your EFS bill would gradually get bigger and bigger, so it’s not sustainable.

To solve this, you can use AWS Backup to take incremental backups of your EFS drive on a cron schedule. That way you can housekeep the EFS drive, leaving, say, one week’s worth of builds on there. In our Gatsby app example above, 5 days at 40 builds a day comes to 120GB! Storing 120GB on EFS Standard costs about $40 per month.
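The housekeeping side can be a nightly cron job on any instance with the mount (the 7-day retention and the /mnt/efs/builds path are assumptions):

# crontab entry: at 03:00 each night, delete build folders older than 7 days
0 3 * * * find /mnt/efs/builds -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +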

Scheduling an AWS backup

In the extremely rare situation where you need to roll back to a build over a week old, restore the EFS drive to the correct point in time, then do the CodeDeploy rollback as normal.
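A restore is driven from an AWS Backup recovery point. As a rough sketch (the ARNs and IDs are placeholders, and the exact metadata keys for EFS restores should be checked against the AWS Backup documentation), an item-level restore back into the same file system looks something like:

aws backup start-restore-job \
  --recovery-point-arn arn:aws:backup:eu-west-1:111122223333:recovery-point:EXAMPLE-POINT-ID \
  --iam-role-arn arn:aws:iam::111122223333:role/service-role/AWSBackupDefaultServiceRole \
  --metadata '{"file-system-id":"fs-0123456789abcdef0","newFileSystem":"false","ItemsToRestore":"[\"/builds/1234\"]"}'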

The second, and more disturbing, downside is that having a single EFS drive mounted on all your highly available EC2 instances weakens your DR capability. In other words, it’s a single point of failure … if the EFS drive goes bang, ALL your instances fail, defeating the multi-AZ architecture you’ve worked so hard to accomplish.

So, a hybrid model may be more to your liking. The EC2 instances serving your apps could copy the build folder from the EFS mount onto their local EBS drives. Any failure during this copy would cause the CodeDeploy stage to fail, and you’d be left with a working site pointing at the previous build.
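In hook-script form, the hybrid copy might look like this (paths are again assumptions; the symlink only moves once the local copy has succeeded):

#!/bin/bash
set -euo pipefail
BUILD_ID=$(cat /opt/deploy/BUILD_ID)
# Copy the build from shared EFS onto this instance's local EBS volume
rsync -a --delete "/mnt/efs/builds/$BUILD_ID/" "/var/www/releases/$BUILD_ID/"
# Only repoint the web root once the local copy is complete
ln -sfn "/var/www/releases/$BUILD_ID" /var/www/current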

This hybrid approach will never be as fast as simply switching a symlink, but you still save all the NAT Gateway expense, plus you save the download and unzipping time from S3.

The third downside is the security aspect. Using a proper artifact repo such as S3 or Nexus ensures that nobody has tinkered with the build since it was created: IAM can be configured so that only CodeBuild can write to the bucket, and only CodeDeploy can read from it. HOWEVER, we can achieve the same result by making CodeBuild MD5-checksum the build folder and store the result in the build metadata in S3. Then, in the CodeDeploy stage, we simply make sure the checksums match before deploying.

find buildDir -type f -exec md5sum {} \; > buildDir.md5
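On the CodeDeploy side, the matching check is a one-liner run from the same working directory (here buildDir stands for the BUILD_ID folder on the EFS mount):

# Exits non-zero if any file in the manifest has been altered or is missing
md5sum -c --quiet buildDir.md5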

Conclusion

The purist may argue that you should use a “proper” secure artifact repository such as Nexus, Artifactory, or even S3 (as CodePipeline does by default), but the main purpose of an artifact repo is to allow you to pull a versioned build on demand. The EFS method described above still uses S3 as a repo for every historic build’s metadata; only the bulky builds themselves are stored on the EFS drive. By using AWS Backup to create incremental snapshots at regular intervals, you achieve the same functionality as a “proper” repo at a fraction of the cost, both financially and in time savings.
