AWS CloudFront & Lambda robots.txt

A small piece of configuration I add to all of our web dashboards hosted on CloudFront is one that disallows search engines from indexing the site: a little security through obscurity on top of the real security. With CloudFront and Lambda this is quick and easy.

CloudFront Configuration

In CloudFront, add a Behavior with the Path Pattern

/robots.txt


Lambda Configuration

The first step is to set up the Node.js script that replies to any request with:

User-agent: *
Disallow: /

This will inform search engines not to index the site.

The Node.js script I use can be found here.

Create a Lambda function in us-east-1 (CloudFront can only use Lambda@Edge functions from this Region) called something memorable.

As part of this, it will ask you to create a role if this is the first time. Create one with Basic Edge Lambda permissions.

Next, add a CloudFront trigger with the following selections:

  • Distribution ID of the CloudFront distribution we’re adding this onto.
  • Cache Behavior of /robots.txt.
  • CloudFront event of “Viewer Request”.

Tick the acknowledgement box and save. After a few minutes the configuration should be live.

Test this by browsing to https://yoursite.com/robots.txt
