AWS CloudFront & Lambda robots.txt
A little configuration I add to all of our web dashboards hosted on CloudFront is to disallow search engines from indexing the site: a little security through obscurity on top of the real security. With CloudFront and Lambda@Edge this is quick and easy.
In CloudFront, enable a Behavior with the Path Pattern /robots.txt, so requests for that path have their own cache behavior to attach the Lambda function to.
The first step is to set up the Node.js script that replies to any request it receives with:
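```
User-agent: *
Disallow: /
```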
This will inform search engines not to index the site.
The Node.js script I use can be found here.
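For reference, a minimal sketch of that kind of handler (not necessarily the exact script behind the link, but the same idea: a viewer-request function that answers with the robots.txt above):

```javascript
'use strict';

// Disallow-all robots.txt served straight from the edge.
const robotsTxt = 'User-agent: *\nDisallow: /\n';

exports.handler = (event, context, callback) => {
    // Returning a response from a viewer-request trigger short-circuits
    // CloudFront: the request never reaches the origin.
    callback(null, {
        status: '200',
        statusDescription: 'OK',
        headers: {
            'content-type': [{ key: 'Content-Type', value: 'text/plain' }],
        },
        body: robotsTxt,
    });
};
```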
Create a Lambda function in us-east-1 (N. Virginia), the only region CloudFront can pick Lambda@Edge functions up from, and call it something memorable.
As part of this it will ask you to create a role if this is the first time. Create one with Basic Edge Lambda permissions as shown below:
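If you build the role by hand rather than from that template, the important part is the trust policy: it must let both the Lambda and Lambda@Edge service principals assume the role. A minimal sketch:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com", "edgelambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```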
Next add a CloudFront trigger with the following selections:
- Distribution ID of the CloudFront distribution we’re adding this to.
- Cache Behavior as /robots.txt (the Behavior created earlier).
- CloudFront event as “Viewer Request”.
Tick the acknowledgement box and save. Wait a few minutes and the configuration should be live.
Test this by browsing to https://yoursite.com/robots.txt (or running curl https://yoursite.com/robots.txt); you should get back the Disallow rules above.