Chaos Engineering: How to Create an Automated Chaos Gauntlet for Staging (Gremlin, Jenkins, AWS CodeBuild, AWS CodeDeploy, and AWS CloudFormation)

Tammy Butow
Published in Chaos Engineering
8 min read · Oct 22, 2020

In this tutorial, we will demonstrate how to use Jenkins to create an automated chaos gauntlet. This will be done using Jenkins Pipelines and Stages to inject a controlled amount of failure with Gremlin. We then add a final stage that allows you to optionally halt the attack from the pipeline, rather than having to wait for the full duration of the attack.

Gremlin can be easily integrated into your Jenkins pipelines using the Gremlin API

If you’d like to see a video of this in action before you create it yourself, you can view it below:

Prerequisites

Before you begin this tutorial, you’ll need the following:

Step 1 — Spin up a sample application and build-deploy pipeline with AWS CodeBuild, AWS CodeDeploy, and Jenkins

In this step, you’ll use a CloudFormation template to spin up a sample application and a build-deploy pipeline.

First, launch and run the CloudFormation template below:

https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/quickcreate?templateURL=https://blog-for-codedeploy.s3.eu-central-1.amazonaws.com/CodeDeployTemplate.json

Choose a stack name that you prefer and fill in the configuration parameters as described below:

Next, launch the CloudFormation stack. Once it has been created, open the Outputs tab to find the values you will need in the following steps.
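If you would rather read the stack outputs from a script than from the console, the following is a minimal Python sketch. It assumes the third-party boto3 library and AWS credentials are available; the function names are my own, and the stack name is whatever you chose when launching the template.

```python
from typing import Dict


def stack_outputs(describe_stacks_response: dict) -> Dict[str, str]:
    """Flatten the Outputs of the first stack into {OutputKey: OutputValue}."""
    stacks = describe_stacks_response.get("Stacks", [])
    if not stacks:
        raise ValueError("no stacks in response")
    return {o["OutputKey"]: o["OutputValue"] for o in stacks[0].get("Outputs", [])}


def fetch_stack_outputs(stack_name: str, region: str = "us-east-1") -> Dict[str, str]:
    """Call CloudFormation and return the outputs of the named stack."""
    import boto3  # third-party; imported lazily so the helper above has no dependencies

    client = boto3.client("cloudformation", region_name=region)
    return stack_outputs(client.describe_stacks(StackName=stack_name))
```

Calling fetch_stack_outputs with your stack name should return a dictionary that includes the JenkinsServerDNSName value used in the next step.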

Step 2 — Configure Your Jenkins Instance

In this section, I discuss how to access, unlock, and customize your Jenkins server.

  1. Copy the JenkinsServerDNSName value from the Outputs tab of the CloudFormation stack, and paste it into your browser.
  2. To unlock the Jenkins server, SSH to the server using the IP address and key pair, following the instructions from Unlocking Jenkins.
  3. As the root user, use `cat` to print the log file (/var/log/jenkins/jenkins.log) and copy the automatically generated alphanumeric password (between the two sets of asterisks). Then, use the password to unlock your Jenkins server, as shown in the following screenshots.
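Picking the password out of the log by eye is easy to get wrong, so here is a small Python sketch that does it for you. It assumes the password appears on a line of its own, as a single alphanumeric token of at least 16 characters, somewhere after a banner line made of asterisks. That layout is an assumption based on the description above, so adjust the patterns if your log looks different.

```python
import re
from typing import Optional

BANNER = re.compile(r"\*{10,}")        # a line consisting only of asterisks
TOKEN = re.compile(r"[0-9A-Za-z]{16,}")  # a lone alphanumeric token (assumed password shape)


def find_initial_password(log_text: str) -> Optional[str]:
    """Return the first lone alphanumeric token that follows an asterisk banner."""
    seen_banner = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if BANNER.fullmatch(line):
            seen_banner = True
            continue
        if seen_banner and TOKEN.fullmatch(line):
            return line
    return None
```

To use it on the server, read /var/log/jenkins/jenkins.log as root and pass the text to find_initial_password.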

Step 3 — Add Gremlin API Keys To Jenkins

In this step, you’ll enter your gremlin-api-key credentials into the Jenkins instance.

Open the following in your browser:

From Jenkins > Credentials > System > Global credentials (unrestricted), set the Kind and Scope as shown here. Then, enter your Gremlin Secret. Enter `gremlin-api-key` as the ID. Click OK to save.

Step 4 — Create your Chaos Staging Pipeline

In this step, you’ll create a Jenkins pipeline that uses the Gremlin API to run a Chaos Gauntlet (a series of Gremlin attacks) on your staging environment.

Click New Item > Pipeline and enter the name “chaos-staging-pipeline”. Add the source for your application/code, e.g. https://github.com/tammybutow/jenkins-gremlin.

Add your source (github project)

Scroll down to Advanced Project Options and add in the Pipeline script that will be used to call Gremlin.

Add your pipeline script (see the example below)

This is an example of a Pipeline script for your Chaos Staging Pipeline. It will run a Chaos Gauntlet that includes a CPU attack.

    pipeline {
        agent none
        environment {
            ATTACK_ID = ''
            // Must match the credential ID you saved in Step 3
            GREMLIN_API_KEY = credentials('gremlin-api-key')
        }
        parameters {
            string(name: 'CPU_LENGTH', defaultValue: '1000', description: 'Duration of CPU attack')
            string(name: 'CPU_PERCENT', defaultValue: '50', description: 'Percentage of CPU attack')
            string(name: 'CPU_CORE', defaultValue: '4', description: 'Number of cores to impact')
            string(name: 'TARGET_IDENTIFIER', defaultValue: '172.31.56.245', description: 'Host to target')
        }
        stages {
            stage('Chaos Staging Gauntlet') {
                agent any
                steps {
                    script {
                        // Start the CPU attack and keep the API response as the attack ID
                        ATTACK_ID = sh (
                            script: "curl -s -H 'Content-Type: application/json' -H 'X-Gremlin-Agent: jenkins' -H 'Authorization: Key ${GREMLIN_API_KEY}' https://api.gremlin.com/v1/attacks/new --data '{ \"command\": { \"type\": \"cpu\", \"args\": [\"-c\", \"$CPU_CORE\", \"-l\", \"$CPU_LENGTH\", \"-p\", \"$CPU_PERCENT\"] },\"target\": { \"type\": \"Exact\", \"hosts\" : { \"ids\": [\"$TARGET_IDENTIFIER\"] } } }' --compressed",
                            returnStdout: true
                        ).trim()
                        echo "see your attack at https://app.gremlin.com/attacks/${ATTACK_ID}"
                    }
                }
            }
            stage('Observe and/or Halt Chaos') {
                agent any
                // Pause the pipeline and ask whether to stop the attack early
                input {
                    message 'Do you want to halt attack?'
                    parameters {
                        choice(choices: ['yes' , 'no'], name: 'HALT', description: '')
                    }
                }
                steps {
                    script {
                        if (env.HALT == 'yes') {
                            // Halt the running attack by deleting it through the API
                            sh "curl -s -X DELETE https://api.gremlin.com/v1/attacks/${ATTACK_ID} -H 'X-Gremlin-Agent: jenkins' -H 'Authorization: Key ${GREMLIN_API_KEY}' --compressed"
                        }
                    }
                }
            }
        }
    }
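
The hand-escaped JSON in the curl command is the easiest part of this script to break. As a rough illustration of what the two API calls contain, here is a Python sketch that builds the same requests using only the standard library. The endpoint, headers, and attack arguments are copied from the pipeline script above; the function names are my own, and nothing is sent until you pass a request to urllib.request.urlopen.

```python
import json
import urllib.request

GREMLIN_API = "https://api.gremlin.com/v1"


def _headers(api_key: str) -> dict:
    """Headers used by both calls in the pipeline script."""
    return {"X-Gremlin-Agent": "jenkins", "Authorization": f"Key {api_key}"}


def cpu_attack_payload(target: str, cores: int, length: int, percent: int) -> dict:
    """Same body as the --data argument in the pipeline's curl command."""
    return {
        "command": {
            "type": "cpu",
            "args": ["-c", str(cores), "-l", str(length), "-p", str(percent)],
        },
        "target": {"type": "Exact", "hosts": {"ids": [target]}},
    }


def new_attack_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that starts an attack."""
    headers = {"Content-Type": "application/json", **_headers(api_key)}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{GREMLIN_API}/attacks/new", data=body, headers=headers, method="POST"
    )


def halt_attack_request(api_key: str, attack_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE that halts a running attack."""
    return urllib.request.Request(
        f"{GREMLIN_API}/attacks/{attack_id}", headers=_headers(api_key), method="DELETE"
    )
```

Letting json.dumps produce the body avoids the backslash escaping that the inline curl command needs, which makes it easier to add further attacks later.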

I also recommend adding more attacks to your Chaos Gauntlet, e.g. Disk attack, IO attack and Memory attack. You can use the built-in Gremlin API Examples feature to create the curl commands to run these attacks.

Now you are ready to build your Staging Chaos Pipeline. To get started, click Build. You will see the Chaos Gauntlet begin running the attacks.

Staging Chaos Pipeline

Now that we’ve created our Staging Chaos Pipeline, let’s integrate it with the Build-Test-Deploy pipeline for our sample application.

Step 5 — Create a Build-Test-Deploy Jenkins Pipeline

First, we’ll create a Freestyle project in Jenkins. Click Jenkins, then New Item and enter the name for your pipeline, e.g. “Demo-Pipeline”. Then select Freestyle Project.

Now choose Git as the Source Code Management and enter your repository URL. For example, you can use https://github.com/tammybutow/jenkins-gremlin.git. Change the Branch Specifier to **.

Now we’ll set up the Build Triggers. In this example we’ll use Polling, and we’ll have Jenkins poll the GitHub repo for changes every 2 minutes. We could also use the GitHub hook trigger by setting up WebHooks for our repository. As the Poll schedule, enter `H/2 * * * *`.
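
If the H in that schedule is unfamiliar: Jenkins replaces it with a value derived from a hash of the job name, so that different jobs on the same schedule do not all poll at the same moment. The Python sketch below illustrates the idea for the minute field. It uses CRC32 as a stand-in hash, which is an assumption for illustration only; Jenkins uses its own internal hash, so the actual offset for your job will differ.

```python
import zlib
from typing import List


def poll_minutes(job_name: str, step: int = 2) -> List[int]:
    """Minutes of the hour at which an H/<step> schedule would fire.

    The offset is derived from the job name so it stays stable per job.
    CRC32 is only a stand-in for Jenkins' internal hash.
    """
    offset = zlib.crc32(job_name.encode("utf-8")) % step
    return list(range(offset, 60, step))
```

For any job name, poll_minutes returns 30 minute values spaced two minutes apart, starting at either minute 0 or minute 1.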

Set the poll schedule

Select the option to delete the workspace before the build starts under Build Environment.

Next, we’ll configure the Build steps with Jenkins. Under AWS CodeBuild, select the option to manually specify access and secret keys. Navigate to the AWS IAM console to get your AWS Access Key and AWS Secret Key. Then enter the region you created your CloudFormation stack in, e.g. us-east-1, and enter your project name. The project name is the name of your CodeBuild project, which you can find in your CloudFormation outputs. Lastly, check Use Jenkins source as the project configuration.

CloudFormation Stack Outputs

Now you’re ready to add the next steps of the build. Scroll down and click to add a build step, select File Operations:

Next, add the File Delete File Operation:

In the Include File Pattern field, choose everything by using *

Now add a Build Step that is an HTTP Request and use the S3 artifact as the URL, e.g. https://jenkins-codedeploybucket-kjjafe4o06iv.s3.amazonaws.com/codebuild-artifact.zip. This URL path was created for you when you launched the CloudFormation stack. You’ll find your URL path in the CloudFormation Outputs tab. For example, mine is below:

Check yes next to Ignore SSL errors.

Now we’ll add our final build step. Click to add a build step, then select File Operations and choose Unzip. Enter the following as the zip file, which is your build artifact location: codebuild-artifact.zip
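
To make it clearer what these three build steps do to the workspace, here is a rough Python equivalent: empty the workspace, download the artifact, then unzip it. This is only a sketch of the behavior, not what the Jenkins plugins run internally. The function names are my own, and unlike the Ignore SSL errors option above, the download here keeps certificate verification on.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def clean_workspace(workspace: Path) -> None:
    """Remove every top-level entry, similar to File Delete with the pattern *."""
    for entry in workspace.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def download_artifact(url: str, workspace: Path,
                      name: str = "codebuild-artifact.zip") -> Path:
    """Fetch the build artifact into the workspace, like the HTTP Request step."""
    dest = workspace / name
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest


def unzip_artifact(archive: Path, workspace: Path) -> List[str]:
    """Extract the artifact into the workspace and return the member names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(workspace)
        return zf.namelist()
```

Run in that order against your S3 artifact URL, these leave the workspace holding only the zip and its extracted contents, which is what the deployment step expects.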

Now we are ready to configure our Post-Build actions. Click to add a step and choose “Deploy an application to AWS CodeDeploy”.

We’ll be using several items from your CloudFormation Stack outputs to finish this step. Enter your AWS CodeDeploy Application Name, AWS CodeDeploy Deployment Group, AWS CodeDeploy Deployment Config, AWS Region and S3 Bucket. Leave the Include Files setting as is and add 0 as the Proxy Port. Select Deploy Revision and check Wait for deployment to finish. You can leave the remaining settings at their defaults.
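
For reference, the values you enter here correspond closely to the parameters of a CodeDeploy deployment request. The Python sketch below shows that mapping using the third-party boto3 library. It assumes the revision zip is already in your S3 bucket (the Jenkins plugin handles the upload for you), and the function names and example values are placeholders of my own.

```python
def deployment_request(application: str, group: str, config: str,
                       bucket: str, key: str, bundle_type: str = "zip") -> dict:
    """Keyword arguments for CodeDeploy's create_deployment call."""
    return {
        "applicationName": application,
        "deploymentGroupName": group,
        "deploymentConfigName": config,
        "revision": {
            "revisionType": "S3",
            "s3Location": {"bucket": bucket, "key": key, "bundleType": bundle_type},
        },
    }


def deploy_and_wait(region: str, request: dict) -> str:
    """Start the deployment and block until it succeeds, like 'Wait for deployment to finish'."""
    import boto3  # third-party; imported lazily so deployment_request has no dependencies

    client = boto3.client("codedeploy", region_name=region)
    deployment_id = client.create_deployment(**request)["deploymentId"]
    client.get_waiter("deployment_successful").wait(deploymentId=deployment_id)
    return deployment_id
```

Each argument to deployment_request comes from the same CloudFormation outputs you used to fill in the form.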

Step 6 — Automatically Trigger Your Chaos Pipeline

Now we are ready to connect our Build-Test-Deploy Pipeline to our Staging Chaos Pipeline. Click to add a post-build action and choose “Build Other Projects”. Scroll up to find this item and drag it under your deployment step. We want this Chaos Gauntlet to trigger after you deploy your application to Staging. As the project to build, enter the name of the Chaos Pipeline you created in Step 4, e.g. chaos-staging-pipeline. Leave the option set to “Trigger only if the build is stable”.
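
The “Trigger only if the build is stable” option means the chaos pipeline starts only when the upstream build finishes with a SUCCESS result, not when it is unstable or failed. If you ever want to start the chaos pipeline from outside Jenkins, the Python sketch below shows the same rule and builds a URL for the Jenkins remote build endpoint. The host name is a placeholder, the function names are my own, and a real request would also need to be authenticated.

```python
from typing import Dict
from urllib.parse import quote, urlencode


def should_trigger(upstream_result: str) -> bool:
    """Mirror 'Trigger only if the build is stable': only SUCCESS qualifies."""
    return upstream_result == "SUCCESS"


def build_trigger_url(jenkins_base: str, job_name: str, params: Dict[str, str]) -> str:
    """URL that starts a parameterized build through the Jenkins remote API."""
    base = jenkins_base.rstrip("/")
    return f"{base}/job/{quote(job_name)}/buildWithParameters?{urlencode(params)}"
```

For example, build_trigger_url("http://jenkins.example", "chaos-staging-pipeline", {"CPU_PERCENT": "50"}) gives a URL you could POST to with a Jenkins user name and API token.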

Click Save. You are now ready to run your Build-Test-Deploy pipeline, which then automatically triggers your Staging Chaos Pipeline.

Step 7 — Learning from your Chaos Pipeline

Running your Chaos Pipeline lets you validate assumptions about your staging environment. For example, you can use this Chaos Gauntlet to validate that autoscaling is functioning as expected for your clusters. View the video below to see a demo of this in action:
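
One way to check autoscaling from a script is to compare the Auto Scaling group's capacity before and during the CPU attack. The Python sketch below works on the response shape returned by the EC2 Auto Scaling describe_auto_scaling_groups call. The group name, function names, and the simple "desired capacity went up" rule are all assumptions for illustration; your own success criteria may be stricter.

```python
from typing import Tuple


def group_capacity(describe_response: dict, group_name: str) -> Tuple[int, int]:
    """Return (desired capacity, in-service instance count) for one group."""
    for group in describe_response.get("AutoScalingGroups", []):
        if group["AutoScalingGroupName"] == group_name:
            in_service = sum(
                1 for inst in group.get("Instances", [])
                if inst.get("LifecycleState") == "InService"
            )
            return group["DesiredCapacity"], in_service
    raise KeyError(group_name)


def scaled_out(before: Tuple[int, int], during: Tuple[int, int]) -> bool:
    """True if the desired capacity increased while the attack was running."""
    return during[0] > before[0]
```

Take one snapshot before you click Build and another a few minutes into the attack, then pass both to scaled_out.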

Conclusion

This tutorial walked you through how to create a Chaos Pipeline for Staging that runs an automated Chaos Gauntlet. It makes use of the Gremlin API, Jenkins, AWS CodeBuild, AWS CodeDeploy, and AWS CloudFormation. We used Jenkins to create these CI/CD pipelines, but you can also use Gremlin with other tools such as AWS Pipelines and Spinnaker.
