Create a validation step to decide whether CodeDeploy triggers a task revision update

Francisco Ferreira
Aug 20 · 5 min read
Photo by Nicholas Swanson on Unsplash

This post assumes that you have a good knowledge of AWS services such as CloudFormation, ECS, and the Code suite (CodeBuild, CodeDeploy, CodePipeline), so I won't dive deep into any single resource, as that is not the purpose of this post. That said, you know what to do.


Let's Picture This

You have created a CloudFormation template that spins up an ECS cluster on EC2 instances to run some microservices, plus a CodePipeline that pushes images to ECR and deploys updates to ECS.

Everything looks really smooth up to now.

This is the first time you're deploying your template, and everything starts fine: the EC2 instances, the ECS cluster, the ECR repo, the ECS service and task definition, and all stages of CodePipeline are created…

But after a while, you notice that your CloudFormation stack is still being created and never completes. Hours later, everything is rolled back and you don't know why.


The Problem

When deploying an ECS microservice, you need a task definition pointing to a Docker image in an ECR repo, and a service configured with that task definition.

This workflow looks fine, but we are missing one step: how is that Docker image going to be pushed if we have a fully automated template?

Well, our CI/CD will handle that. Our CloudFormation template already created everything for us: the integration with git to pull code, the CodeBuild project to build and push Docker images to ECR, and the CodeDeploy stage to trigger a task definition revision. It's all perfect!

Not that fast…

Writing your infrastructure as code with CloudFormation can be tricky sometimes: you can't perform validations during resource creation, and that can lead to race conditions and deadlocks.


The Real Problem

CloudFormation provisions resources based on conditions and dependencies (you already know that).

Based on your template, the microservice and CI/CD resources will be created in parallel and, following the problem described above, we have two scenarios:

Scenario 1

The task definition is created before the CI/CD build stage completes, so the ECS service that uses it is unable to stabilize because the image is missing from ECR.

Fortunately, the service will keep trying to start tasks until it finds an image or CloudFormation requests a rollback. Once the CI/CD build stage finishes, the service finds an image to run, and then the CI/CD deploy stage triggers a task definition revision update.

Maybe you see the problem now. Not yet? Don't worry, keep going.

Scenario 2

The CI/CD build stage finishes first, before the task definition is created, which is really good.

The task definition will have a Docker image available, so the service can stabilize, but the CI/CD deploy stage still has to run, and it will trigger a task definition revision update.

If the deploy stage runs before the task definition is created, it will fail the CodePipeline execution, but it will not fail the CloudFormation stack.

This is where the race condition/deadlock lies.

When CloudFormation creates a task definition, it starts at revision 1 and waits for a stable signal to complete the stack creation.

But if you have CI/CD running in parallel and it triggers a task definition revision update, the revision is raised to 2, causing CloudFormation to wait forever for revision 1 to stabilize.
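One rough way to watch this happen: the revision is the number after the last colon of a task definition ARN, so you can check which revision the service currently references. This is only a sketch. The ARN below is a made-up placeholder, the cluster and service names in the comment are hypothetical, and it assumes a first deployment of a brand-new task definition family (so CloudFormation's revision is 1).

```shell
#!/usr/bin/env bash
# In a real account the ARN could be read with something like (names are hypothetical):
#   aws ecs describe-services --cluster my-cluster --services my-service \
#     --query 'services[0].taskDefinition' --output text
TASK_DEF_ARN="arn:aws:ecs:us-west-2:123456789012:task-definition/my-service:2"

# The revision is everything after the last colon of the ARN.
REVISION="${TASK_DEF_ARN##*:}"

if [ "$REVISION" -gt 1 ]; then
  STATUS="revision $REVISION: something other than the initial stack creation registered a new revision"
else
  STATUS="revision 1: still the revision created by CloudFormation"
fi
echo "$STATUS"
```

If the service already points at revision 2 while the stack is still in progress, the pipeline's deploy stage most likely got there first.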


Solution

Finally the solution, the most exciting part.

I tried every single combination of resource ordering to solve this race condition, until I realized that the main problem was the deploy stage: once the deploy stage was fixed, everything should work just fine.

If you followed the official ECS Continuous Deployment Tutorial, you should have something like this inside your buildspec.yml file:

version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=${ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/${ECR_REPO_NAME}
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
  build:
    commands:
      - echo Building the Docker image...
      - docker build -t $REPOSITORY_URI:latest .
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Pushing the Docker images...
      - docker push $REPOSITORY_URI:latest
      - docker push $REPOSITORY_URI:$IMAGE_TAG
      - echo Writing image definitions file...
      # The container name is passed as a %s argument: variables inside single quotes are not expanded
      - printf '[{"name":"%s","imageUri":"%s"}]' "$ECR_REPO_NAME" "$REPOSITORY_URI:$IMAGE_TAG" > imagedefinitions.json
artifacts:
  files: imagedefinitions.json
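One detail of the final printf command is worth a quick sketch: the shell does not expand variables inside single quotes, so a container name written as ${ECR_REPO_NAME} inside the quoted format string ends up in the file literally. Passing it as its own %s argument avoids that. The values below are made-up placeholders, and the sketch assumes, as this post does, that the container name matches the repository name.

```shell
#!/usr/bin/env bash
# Placeholder values for illustration only (assumptions, not real resources).
ECR_REPO_NAME="my-service"
REPOSITORY_URI="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-service"
IMAGE_TAG="abc1234"

# Single quotes keep ${ECR_REPO_NAME} literal, so the name is never substituted.
literal=$(printf '[{"name":"${ECR_REPO_NAME}","imageUri":"%s"}]' "$REPOSITORY_URI:$IMAGE_TAG")

# Passing the name as a %s argument lets the shell expand it before printf runs.
expanded=$(printf '[{"name":"%s","imageUri":"%s"}]' "$ECR_REPO_NAME" "$REPOSITORY_URI:$IMAGE_TAG")

echo "$literal"
echo "$expanded"
```

Only the second form produces a name that the deploy stage can match against a container in the task definition.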

My solution was to create a validation step that decides when CodeDeploy should trigger a task revision update and when it should not.

This is achieved by not providing an imagedefinitions.json at the end of the build whenever the deploy should be skipped.

Ok, but… what are the criteria here to decide that?

I came up with the idea of checking, during the execution and before pushing the new image, whether the repository already has a Docker image. If the repository has at least one image, this is not the first run. The AWS CLI will do the job for us here.


Check It Out

Query ECR repository images with the following command:

aws ecr list-images --repository-name ${ECR_REPO_NAME} --max-items 1

We only need one item to decide whether the repository has an image, and the option --max-items 1 keeps the response short.

The result of this command is a JSON document with a top-level key called imageIds, whose value is either an array of objects or an empty array.

{
  "imageIds": [
    {
      "imageDigest": "sha256:236ce1ed44...",
      "imageTag": "latest"
    }
  ]
}

or

{
  "imageIds": []
}

With that result, we now have all the data we need to validate our first run.

I chose the imageDigest value as the key to decide whether the repository is empty or not. If you run the command with more items, you will see that some images don't have the imageTag property, which makes imageDigest the more reliable marker here.
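To make that choice concrete, here is a small sketch run against a hand-written sample response (an assumption, not real output) in which the only image is untagged. Searching for imageTag would wrongly report an empty repository, while searching for imageDigest does not.

```shell
#!/usr/bin/env bash
# Hand-written sample response (an assumption): one untagged image, digest shortened.
UNTAGGED_ONLY='{"imageIds": [{"imageDigest": "sha256:0123abcd"}]}'

# Checking for imageTag misses untagged images...
if echo "$UNTAGGED_ONLY" | grep -q "imageTag"; then
  TAG_CHECK="has images"
else
  TAG_CHECK="empty"
fi

# ...while this sample, like the responses described above, always carries imageDigest.
if echo "$UNTAGGED_ONLY" | grep -q "imageDigest"; then
  DIGEST_CHECK="has images"
else
  DIGEST_CHECK="empty"
fi

echo "imageTag check:    $TAG_CHECK"
echo "imageDigest check: $DIGEST_CHECK"
```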

Now what we need to do is create a condition that decides whether or not to provide the imagedefinitions.json file.

version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=${ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/${ECR_REPO_NAME}
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
  build:
    commands:
      - |
        # Check the repository BEFORE pushing, so the first run sees it empty
        echo Checking Repository Images
        REPOSITORY_IMAGES=$(aws ecr list-images --repository-name ${ECR_REPO_NAME} --max-items 1)

        echo Building the Docker image...
        docker build -t $REPOSITORY_URI:latest .
        docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
        echo Pushing the Docker images...
        docker push $REPOSITORY_URI:latest
        docker push $REPOSITORY_URI:$IMAGE_TAG
        # Validate if JSON response has imageDigest value
        if echo "$REPOSITORY_IMAGES" | grep -q "imageDigest"; then
          echo Repository contains images and Deploy should run
          echo Writing image definitions file...
          # The container name is passed as a %s argument: variables inside single quotes are not expanded
          printf '[{"name":"%s","imageUri":"%s"}]' "$ECR_REPO_NAME" "$REPOSITORY_URI:$IMAGE_TAG" > imagedefinitions.json
        else
          echo Repository DOES NOT contain images, DO NOT run deploy
        fi

artifacts:
  files: imagedefinitions.json

Changes Breakdown

REPOSITORY_IMAGES=$(aws ecr list-images --repository-name ${ECR_REPO_NAME} --max-items 1)

The $(command) syntax (command substitution) runs the command and stores its output in the variable. Even though the result is in JSON format, it is still plain text, which lets us run ordinary text tools against it.

if echo "$REPOSITORY_IMAGES" | grep -q "imageDigest";

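The grep command searches its input for a piece of text. Piping echo "$VALUE" into it feeds the text to grep on standard input instead of from a file, and with -q grep prints nothing and only sets its exit status: 0 if the text was found and 1 if not.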

Using this approach inside a conditional, we control whether or not an imagedefinitions.json file is created. The existence of this file determines whether CodeDeploy will run or not.
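The decision can also be tried locally without touching AWS. The sketch below wraps it in a hypothetical helper, write_imagedefinitions (my own name, not part of any AWS tool), and feeds it hand-written sample responses; the image URI and container name are placeholders.

```shell
#!/usr/bin/env bash
# write_imagedefinitions is a hypothetical helper for illustration only.
# $1: JSON text from `aws ecr list-images`, captured before the push
# $2: container name   $3: image URI   $4: output directory
write_imagedefinitions() {
  if echo "$1" | grep -q "imageDigest"; then
    printf '[{"name":"%s","imageUri":"%s"}]' "$2" "$3" > "$4/imagedefinitions.json"
    echo "deploy"
  else
    echo "skip"
  fi
}

WORKDIR=$(mktemp -d)

# First run: the repository was empty before the push, so no file is written.
FIRST_RUN=$(write_imagedefinitions '{"imageIds": []}' \
  my-service example/my-service:abc1234 "$WORKDIR")
if [ -f "$WORKDIR/imagedefinitions.json" ]; then FIRST_FILE="present"; else FIRST_FILE="absent"; fi

# Later run: an image already existed, so the file is written and the deploy can run.
LATER_RUN=$(write_imagedefinitions '{"imageIds": [{"imageDigest": "sha256:0123abcd", "imageTag": "latest"}]}' \
  my-service example/my-service:abc1234 "$WORKDIR")
if [ -f "$WORKDIR/imagedefinitions.json" ]; then LATER_FILE="present"; else LATER_FILE="absent"; fi

echo "first run: $FIRST_RUN (file $FIRST_FILE)"
echo "later run: $LATER_RUN (file $LATER_FILE)"
rm -rf "$WORKDIR"
```

Keeping the check in one function makes it easy to exercise both branches before committing the buildspec.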


Conclusion

Nothing further to say. I hope the scenario was well explained and the solution helps you.

You can find the build spec file in my GitHub account.

Better Programming

Advice for programmers.

Thanks to Tainara Specht

