Look Mum, no servers!

Rowena Parsons
Sainsbury’s Tech Engineering
Jul 2, 2018

Here at Sainsbury’s Digital & Technology, we’re keen to ‘Live Well For Less’ by minimising the amount of static infrastructure we spin up in our cloud environments. In a recent sprint, our ‘Labs and Algorithms’ team was quite happy with a serverless solution they found to a seemingly simple requirement. It’s not rocket science, but it is a practical pattern, so we decided to share it.

Pass the parcel

One of our partner organisations regularly sends us some information updates by placing files in an S3 bucket. As these files arrive, we need to copy them into a bucket in our own production environment for further processing. One of our developers had been doing this manually using the AWS CLI, which only takes a minute but is one of those annoying little distractions that we like to automate away.

So: copy data from one S3 bucket to another whenever a new file arrives, without having a server sitting idle in between. There must be an easy way to do that with AWS Lambda, right? A Lambda function can execute for up to 5 minutes, which is ample for this simple file copy. Here’s the gist of the AWS help on ‘Using AWS Lambda with S3’:

Using notification settings on the bucket, you can get S3 to create an event whenever a new file is dropped into the bucket. Write a function that responds to this notification by finding the new file and copying it into the target bucket. S3 can then invoke your Lambda function, passing the event data as a parameter. So our first iteration was exactly that: an event notification on the source bucket invoking a Lambda function that finds the new file and copies it to the target bucket.
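A minimal sketch of what such a handler can look like, in Python with boto3; the target bucket name here is hypothetical, and error handling is omitted:

import urllib.parse
import boto3

s3 = boto3.client("s3")

# Hypothetical name: in practice this would come from configuration.
TARGET_BUCKET = "our-prod-target-bucket"

def handler(event, context):
    # S3 passes one or more records, each describing a newly created object.
    for record in event["Records"]:
        source_bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event (spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Server-side copy: the object's data never flows through the Lambda.
        s3.copy_object(
            Bucket=TARGET_BUCKET,
            Key=key,
            CopySource={"Bucket": source_bucket, "Key": key},
        )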

Games with frontiers

One small problem though: the diagram in the AWS documentation clearly shows a single AWS account as its boundary. In our case, where the source bucket and the target bucket have different owners and live in different AWS accounts, we found that this architecture doesn’t work.

Actually, we initially set up our test in a single account. We had some logging enabled, but it was incomplete: the Lambda function was not writing anything to CloudWatch. We fixed that by granting the function access to AWS X-Ray, a tracing service that integrates with AWS CloudWatch. It was simply a matter of attaching the following two policies to the Role that our Lambda uses.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["xray:*"],
      "Resource": ["*"]
    }
  ]
}

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": "logs:*",
      "Resource": "*"
    }
  ]
}
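If you’d rather script that step than click through the console, here is a sketch of the equivalent boto3 calls; the role and policy names are hypothetical:

import json
import boto3

iam = boto3.client("iam")

# Hypothetical role name: use the execution role your Lambda runs under.
ROLE_NAME = "s3-copy-lambda-role"

policies = {
    "xray-access": {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["xray:*"], "Resource": ["*"]}
        ],
    },
    "cloudwatch-logs-access": {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "VisualEditor0", "Effect": "Allow",
             "Action": "logs:*", "Resource": "*"}
        ],
    },
}

# Attach each document to the role as an inline policy.
for name, document in policies.items():
    iam.put_role_policy(
        RoleName=ROLE_NAME,
        PolicyName=name,
        PolicyDocument=json.dumps(document),
    )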

The logging enabled us to debug our function and handle some exceptions in the code. Eventually, when the test file was placed in the test source bucket, the Lambda function would trigger, find the file, and copy it to the test target bucket. But then we realised the test was invalid, because we had created everything in one account. So we set up a new test across two accounts and found that the user in the third-party account couldn’t access the bucket in our production account. This was the error:

An error occurred (InvalidArgument) when calling the PutBucketNotificationConfiguration operation: Unable to validate the following destination configurations

Even though the policy looked correct, Lambda would not receive the S3 notification event. So we googled some more and discovered that, in order to provide cross-account access, we needed to introduce an intermediary.
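For context, the failing step was setting the bucket’s notification configuration to point at a Lambda in another account. Roughly, in boto3 terms (all names here are hypothetical):

import boto3

s3 = boto3.client("s3")

# Hypothetical names. The bucket sits in the third-party account,
# while the Lambda function lives in our production account.
s3.put_bucket_notification_configuration(
    Bucket="third-party-source-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:eu-west-1:111122223333:function:s3-copy",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
# This is the call that raised InvalidArgument:
# 'Unable to validate the following destination configurations'.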

As the ‘Configuring Amazon S3 Event Notifications’ help says under ‘Supported Destinations’, in addition to Lambda, Amazon S3 can send event notification messages to the following destinations:

• an Amazon Simple Notification Service (Amazon SNS) topic

• an Amazon Simple Queue Service (Amazon SQS) queue

We chose Amazon SNS as the intermediary because it’s simpler and cheaper than SQS and gave us what we needed. So in the new architecture, the third-party bucket publishes its event notifications to an SNS topic, and our Lambda function subscribes to that topic to perform the copy.
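One practical consequence is that the event reaching the Lambda changes shape: the original S3 event now arrives as a JSON string inside the SNS message. A sketch of the adjusted handler (again Python with boto3, again with a hypothetical bucket name):

import json
import urllib.parse
import boto3

s3 = boto3.client("s3")

# Hypothetical name, as before.
TARGET_BUCKET = "our-prod-target-bucket"

def handler(event, context):
    for sns_record in event["Records"]:
        # SNS wraps the original S3 event in a JSON-encoded message body.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        for record in s3_event["Records"]:
            source_bucket = record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            s3.copy_object(
                Bucket=TARGET_BUCKET,
                Key=key,
                CopySource={"Bucket": source_bucket, "Key": key},
            )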

One more interesting thing

Although we are using Terraform extensively for AWS right now, in this case we reduced the amount of code we had to write by using the Serverless Framework (https://serverless.com/framework/), a cloud-provider-independent framework for deploying serverless functions.

And that’s it!

We hope you enjoyed reading about our rookie errors and that this post is helpful to other teams. Huge thanks to Marek Fengler and Christopher Cooke, who developed this solution.
