Replay AWS CloudFront Logs to Load Test Your Website

Danielle Fenske
Published in PayScale Tech
Aug 6, 2019

Motivation

We have a Next.js application that runs in a Docker container on Azure Kubernetes Service (AKS), and we ran into a problem: everything would run fine in every environment, right up until we actually deployed to production. Then we would see strange errors, intermittent 504s, and other odd behavior. So we wondered: how can we catch these performance issues and other obscure bugs before the code reaches the production environment?

Idea

We wanted a way to mimic our production traffic volume in our dev environments. We could just hit a few different URLs over and over again, but that wouldn’t really represent what is actually going on in production. Then we thought: what if we could pull a set of real requests made to our production site and replay them against our site in the dev environment?

Solution

Once we had a rough idea of what we wanted, the implementation was simple. I wrote two Node scripts that do the following:

Script #1: Pull an hour’s worth of CloudFront logs from S3

The first script is named download-cloudfront-logs.js. It accepts parameters that specify where in AWS to find the logs you want to use, plus a timestamp that selects the hour of logs to download.

To run the download-cloudfront-logs.js script, I run this command:

node ./download-cloudfront-logs.js --timestamp 2019-08-05-01 --bucketName <the-bucket-with-the-logs> --s3folder <the-folder-in-the-bucket> --cloudfrontId <the-cloudfront-distribution-id>
The output from running ./download-cloudfront-logs.js
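As a rough illustration of how those parameters could map to objects in S3: CloudFront names its standard log files `<optional-prefix>/<distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz`, with the hour in UTC, so one hour of logs for one distribution shares a common key prefix. The sketch below builds that prefix. The function name `buildLogKeyPrefix` and the sample values are hypothetical, and this is not necessarily how download-cloudfront-logs.js does it.

```javascript
// Hypothetical helper (not taken from the original script): builds the S3 key
// prefix shared by every CloudFront log file for one distribution and one hour.
// CloudFront standard log files are named:
//   <optional-prefix>/<distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz
function buildLogKeyPrefix({ s3folder, cloudfrontId, timestamp }) {
  // Reject anything that is not YYYY-MM-DD-HH with plain ASCII hyphens.
  if (!/^\d{4}-\d{2}-\d{2}-\d{2}$/.test(timestamp)) {
    throw new Error(`Expected timestamp as YYYY-MM-DD-HH, got "${timestamp}"`);
  }
  // Strip stray slashes so "logs", "logs/" and "/logs/" all behave the same.
  const folder = (s3folder || '').replace(/^\/+|\/+$/g, '');
  const fileStem = `${cloudfrontId}.${timestamp}.`;
  return folder ? `${folder}/${fileStem}` : fileStem;
}

// Sample values are made up for illustration.
console.log(buildLogKeyPrefix({
  s3folder: 'cloudfront-logs/',
  cloudfrontId: 'EXAMPLEDISTID',
  timestamp: '2019-08-05-01',
}));
```

A prefix like this could then be passed as the `Prefix` parameter when listing the bucket's objects, so that only the files for the requested hour are downloaded.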

Script #2: Replay the logs against a different URL

The second script, load-test.js, replays those logs against a given URL (the dev environment) at the same intervals at which the requests were originally made.

As it executes the requests, the script prints a color-coded list of the status codes it received from our app to the console. If we see any red lines (5xx status codes), that’s an instant clue that we have a problem in our app.
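A minimal sketch of that kind of color coding, using plain ANSI escape codes. Only "red means 5xx" comes from the description above; the other colors, the function names, and the sample paths are assumptions made for illustration.

```javascript
// ANSI escape sequences for terminal colors. Only "red for 5xx" is taken from
// the article; the rest of this palette is an assumption.
const COLORS = {
  green: '\x1b[32m',
  cyan: '\x1b[36m',
  yellow: '\x1b[33m',
  red: '\x1b[31m',
  reset: '\x1b[0m',
};

// Map an HTTP status code to a color name.
function colorForStatus(status) {
  if (status >= 500) return 'red';    // server errors: the ones to watch for
  if (status >= 400) return 'yellow'; // client errors
  if (status >= 300) return 'cyan';   // redirects
  return 'green';                     // everything else, i.e. success
}

// Build one console line: colored status code followed by the request path.
function formatResult(status, path) {
  return `${COLORS[colorForStatus(status)]}${status}${COLORS.reset} ${path}`;
}

// Hypothetical paths, for illustration only.
console.log(formatResult(200, '/research/example-page'));
console.log(formatResult(504, '/research/another-page'));
```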

To run the load-test.js script, I run this command:

node ./load-test.js --maxPings 100 --urlHost https://www.<dev-environment-url>.com --prefix '/research/'

This command says “replay just the first 100 requests against our dev environment, and only replay the requests whose paths begin with /research/.”
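The effect of those three flags can be sketched as follows. This assumes the prefix filter is applied before the cap, so that `--maxPings` counts only matching requests; the original script may order these differently. The helper names, the sample paths, and the host `https://dev.example.com` are all hypothetical.

```javascript
// Keep only paths that start with `prefix`, then cap the list at `maxPings`.
// Assumption: filtering happens before capping.
function selectRequests(paths, { prefix = '', maxPings = Infinity } = {}) {
  return paths.filter((path) => path.startsWith(prefix)).slice(0, maxPings);
}

// Point a logged production path at the host under test.
function toTargetUrl(path, urlHost) {
  return new URL(path, urlHost).href;
}

// Made-up paths standing in for the cs-uri-stem values from the logs.
const loggedPaths = ['/research/a', '/about', '/research/b', '/research/c'];
const selected = selectRequests(loggedPaths, { prefix: '/research/', maxPings: 2 });

console.log(selected.map((path) => toTargetUrl(path, 'https://dev.example.com')));
```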

The CloudFront logs are read into memory, sorted by timestamp, and then replayed at the original intervals.
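The read, sort, and replay steps could look roughly like the sketch below. It assumes the log files have already been decompressed into text, and it relies on the standard CloudFront log layout: a `#Fields:` header naming the columns, followed by tab-separated rows with `date` and `time` in UTC. The function names and the sample rows are hypothetical, and `send` stands in for whatever actually issues the HTTP request.

```javascript
// Parse decompressed CloudFront log text into { time, path } entries.
// Column positions are read from the "#Fields:" header rather than hard-coded.
function parseCloudFrontLog(text) {
  let fields = [];
  const entries = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith('#Fields:')) {
      fields = line.slice('#Fields:'.length).trim().split(/\s+/);
    } else if (line.trim() !== '' && !line.startsWith('#')) {
      const values = line.split('\t');
      const row = Object.fromEntries(fields.map((name, i) => [name, values[i]]));
      entries.push({
        time: Date.parse(`${row.date}T${row.time}Z`), // log timestamps are UTC
        path: row['cs-uri-stem'],
      });
    }
  }
  return entries;
}

// Sort by timestamp and express each request as a delay from the first one.
function buildSchedule(entries) {
  const sorted = [...entries].sort((a, b) => a.time - b.time);
  const start = sorted.length > 0 ? sorted[0].time : 0;
  return sorted.map((entry) => ({ path: entry.path, delayMs: entry.time - start }));
}

// Fire `send(path)` for every request after its original offset.
function replay(schedule, send) {
  for (const { path, delayMs } of schedule) {
    setTimeout(() => send(path), delayMs);
  }
}

// Made-up sample with a reduced set of columns, deliberately out of order.
const sampleLog = [
  '#Version: 1.0',
  '#Fields: date time cs-method cs-uri-stem sc-status',
  ['2019-08-05', '01:00:03', 'GET', '/research/c', '200'].join('\t'),
  ['2019-08-05', '01:00:00', 'GET', '/research/a', '200'].join('\t'),
  ['2019-08-05', '01:00:01', 'GET', '/research/b', '200'].join('\t'),
].join('\n');

console.log(buildSchedule(parseCloudFrontLog(sampleLog)));
```

Scheduling every request up front with `setTimeout` keeps the original spacing between requests without waiting for earlier responses, which is closer to how real traffic arrives than sending requests one after another.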

Other Uses for this Script

We have also started using this script to look for bugs in our local environment while we are writing new code. I can run it against my local environment, watch the application logs, and look for exceptions being thrown. I have found cases where I forgot to define a variable or add a null check. These bugs only occurred in special situations, and I wouldn’t have caught them unless I had thought to check a specific URL.

GitHub Repo

Danielle Fenske
PayScale Tech

I am a Software Engineer at PayScale, a crowd-sourced compensation software company. I focus on creating attractive and reliable front-end web apps.