S3 Object URL Signing: Living on the Edge (of AWS CloudFront)

Published in MyHeritage Engineering, Oct 19, 2022 (5 min read)

In this post, we explain why and how we switched from AWS S3 presigned URLs to a custom signing solution on AWS Lambda@Edge, which adds 15 msec of latency at p99.

How it started

At MyHeritage, where we serve billions of assets to our users, we store those assets in private AWS S3 buckets. They may be uploaded by users or derived from uploads by our awesome features like Deep Nostalgia™ or DeepStory. To serve assets from the buckets, we used to generate AWS S3 presigned URLs, with an AWS CloudFront distribution in front for caching.

Reasons why we decided to look for another solution:

We wanted to generate signed URLs that are valid for an arbitrary period of time, which is handy if a user wants to share an asset on social media or bookmark it. But with the AWS SDK, the maximum expiration time for a presigned URL is 7 days from the time of creation (see Using the AWS SDK note).
For security reasons, we wanted our application to have short-lived AWS credentials. But the access key ID is part of the signed URL, and the signature is derived from the matching secret key, which means the signature (and thus the URL) is valid at most as long as the temporary credentials are.
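The two limits above combine into a simple rule: a presigned URL stops working at whichever comes first, the capped lifetime or the expiry of the credentials that signed it. The sketch below is a simplified model of that rule, not AWS SDK code; the function name and the sample times are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Maximum lifetime of a presigned URL, per the 7-day limit mentioned above.
MAX_PRESIGN_LIFETIME = timedelta(days=7)


def effective_expiry(created_at, requested_lifetime, credentials_expire_at=None):
    """Return the moment a presigned URL actually stops working (simplified model)."""
    lifetime = min(requested_lifetime, MAX_PRESIGN_LIFETIME)
    expiry = created_at + lifetime
    if credentials_expire_at is not None:
        # Temporary credentials cut the URL's life short when they expire.
        expiry = min(expiry, credentials_expire_at)
    return expiry


now = datetime(2022, 10, 19, 12, 0, tzinfo=timezone.utc)
# We would like 30 days, but the (hypothetical) credentials expire in 1 hour.
print(effective_expiry(now, timedelta(days=30), now + timedelta(hours=1)))
# With long-lived credentials, the 7-day cap is what bites.
print(effective_expiry(now, timedelta(days=30)))
```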

Roads? Where we’re going, we don’t need roads.

The most obvious off-the-shelf replacement for S3 presigned URLs was… drum roll… CloudFront Signed URL! It ticks all the boxes, but it has its own fly in the ointment: it uses an RSA key pair for signatures. Because RSA is asymmetric, each signature takes longer to compute, and when we sign tens of URLs in a single request, this could hurt our application's performance.

Ok, so if we can’t use provided solutions, we can just create our very own signing and verifying solution. While it’s pretty simple to create a signature in our application, the question is how to verify it.
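To make "create a signature" and "verify it" concrete, here is a minimal sketch of one possible scheme. It assumes a symmetric HMAC-SHA256 signature over the path and an expiration timestamp; the post does not say which algorithm or URL layout we actually use, and the `expires` and `sig` parameter names are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(path, key, expires_at):
    """Return path plus hypothetical 'expires' and 'sig' query parameters."""
    message = f"{path}?expires={expires_at}".encode()
    sig = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'sig': sig})}"


def verify_url(url, key, now=None):
    """Return True only if the signature matches and has not expired."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    message = f"{parts.path}?expires={expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode()) and now < expires_at


key = b"demo-key-not-a-real-secret"
url = sign_url("/photos/1.jpg", key, expires_at=1_700_000_000)
print(verify_url(url, key, now=1_699_999_999))  # True
print(verify_url(url, key, now=1_700_000_001))  # False (expired)
print(verify_url(url.replace("1.jpg", "2.jpg"), key, now=1_699_999_999))  # False (tampered)
```

The signing half runs in the application; the verifying half has to run somewhere in front of S3, which is the question the rest of this post deals with.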

First contender for the title of “Signature Verifier of the Year” was CloudFront Functions. It looked like a cost-effective and elegant solution:

1. They sit very close to the user. Functions run in CloudFront Edge cache locations, and at the time of writing, there are 218+ locations.

Figure: Edge Location vs. Regional Edge Location. Source: aws.amazon.com

2. They are very efficient. A CloudFront Function is a simple ECMAScript (JavaScript) script with just 2 built-in libraries: crypto (which is what we need for signature verification) and querystring. Execution time is capped at 1 msec, which makes the function invisible to users in terms of added latency.

3. They are very inexpensive. You pay only for invocation (no duration billing), and it costs $0.10 per 1 million invocations.
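For a sense of scale, the price quoted above can be turned into a monthly figure. This is back-of-the-envelope arithmetic only: the traffic volume is a made-up example, and any free tier is ignored.

```python
from decimal import Decimal

PRICE_PER_MILLION = Decimal("0.10")  # USD per 1 million invocations, as quoted above


def monthly_invocation_cost(requests_per_day, days=30):
    """Invocation-only cost in USD for a month of traffic."""
    invocations = Decimal(requests_per_day) * days
    cost = invocations / Decimal(1_000_000) * PRICE_PER_MILLION
    return cost.quantize(Decimal("0.01"))


# Hypothetical load: 10 million requests per day.
print(monthly_invocation_cost(10_000_000))  # 30.00
```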

But there is no free lunch: you have to pay for the simplicity and speed.

You can’t inject any params (like environment variables with AWS Lambda) into the function.
The function has no network access — you can’t use other AWS/non-AWS services.

That was a moment of disappointment, because it meant we would have to ship the signature verification key as part of the function's code, which was a no-go for us. As a final goodbye, we submitted a feature request to the AWS team, with fingers crossed that this gap will be covered by the next re:Invent.

We should go deeper

After our defeat by CloudFront Functions at the CloudFront Edge Location, we decided to retreat one level deeper, to the CloudFront Regional Edge Location, and consider Lambda@Edge as a candidate for signature verification. We already had experience with Lambda@Edge, so the only question was where to place the Lambda. With Lambda@Edge, signature verification can be triggered at one of two points: viewer request or origin request.

Figure: Lambda@Edge possible triggers. Source: aws.amazon.com

Origin request:

😀 Lambda@Edge is triggered only when the asset is not in CloudFront cache — lower cost

☹️ The user gets a cached asset even with an expired signature, because expiration is validated only in our Lambda, which does not run on cache hits

Viewer request:

😀 If the URL has an expired signature, Lambda@Edge denies access

☹️ Lambda@Edge is triggered on every request — higher cost

Our choice: origin request. We can tolerate cache hits for expired URLs up to CloudFront's cache TTL, and in case of emergency we can purge the cache manually for selected assets. Another option is to calculate the Cache-Control max-age value from the signature's expiration and set the Cache-Control header, which CloudFront respects.
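The Cache-Control option boils down to clamping max-age to the time left on the signature. Below is a minimal sketch of that calculation; the default TTL of one day is an assumed value, and where the header gets attached (for example, in a response-side trigger) is left open here.

```python
import time


def cache_control_for(expires_at, default_ttl=86400, now=None):
    """Build a Cache-Control value that never outlives the signature."""
    now = time.time() if now is None else now
    remaining = max(0, int(expires_at - now))  # seconds until the signature expires
    return f"max-age={min(default_ttl, remaining)}"


# Signature expires in 10 minutes: cache for 10 minutes, not a full day.
print(cache_control_for(expires_at=1_700_000_600, now=1_700_000_000))  # max-age=600
# Signature already expired: do not cache at all.
print(cache_control_for(expires_at=1_700_000_000, now=1_700_000_600))  # max-age=0
```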

How it’s going

Once we settled on our signature verifier, there were two additional questions to be answered:

How should we allow verified requests to access the S3 bucket?
Where should we store the signature verification key?

Access to S3 Bucket

One option is to sign origin requests with the same S3 signature in Lambda@Edge, but a better option is to use Origin Access Identity with CloudFront. While the setup looks straightforward from the instructions, S3 signature verification returned Access Denied when we turned it on. The missing pieces were:

You can’t have any query parameters which might trigger S3 signature verification (X-Amz-* params) in the origin request
You can’t have a Host header in the origin request (it triggers S3 signature verification as well)
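The first missing piece can be handled inside the origin-request Lambda itself. The sketch below strips X-Amz-* query parameters from the request that Lambda@Edge receives under `Records[0].cf.request`. It is an illustration rather than our production handler: the `expires` and `sig` parameters in the sample event are hypothetical, and the Host header is left alone.

```python
from urllib.parse import parse_qsl, urlencode


def strip_s3_signature_params(request):
    """Remove X-Amz-* query parameters from a CloudFront request dict."""
    pairs = parse_qsl(request.get("querystring", ""), keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if not k.lower().startswith("x-amz-")]
    request["querystring"] = urlencode(kept)
    return request


def handler(event, context):
    # Lambda@Edge passes the CloudFront request under Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    # Returning the request tells CloudFront to forward it to the origin.
    return strip_s3_signature_params(request)


event = {"Records": [{"cf": {"request": {
    "uri": "/photos/1.jpg",
    "querystring": "expires=1700000000&sig=abc&X-Amz-Signature=deadbeef",
    "headers": {},
}}}]}
print(handler(event, None)["querystring"])  # expires=1700000000&sig=abc
```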

Signature Key Storage

At MyHeritage, we typically use HashiCorp Vault for secrets management, but to reduce Lambda@Edge complexity, we opted for a SecureString in AWS Systems Manager Parameter Store. A call to this service adds latency, but a Lambda execution context is reused across many requests, so if you cache the key, that latency is amortized across all invocations.
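Caching the key across invocations can look roughly like this. The sketch keeps the key in a module-level dictionary, which survives for as long as the execution context does. The refresh TTL, the parameter name, and the region are assumptions made for illustration; the Parameter Store call uses boto3's `get_parameter` and is imported lazily so the caching logic can be tried without AWS access.

```python
import time

_cache = {}  # module-level: lives as long as the Lambda execution context


def get_signing_key(fetch, name, ttl_seconds=300, now=None):
    """Return the key, calling fetch(name) only on a cold or stale cache."""
    now = time.monotonic() if now is None else now
    entry = _cache.get(name)
    if entry is None or now - entry[1] >= ttl_seconds:
        entry = (fetch(name), now)
        _cache[name] = entry
    return entry[0]


def fetch_from_parameter_store(name, region_name="us-east-1"):
    """Read a SecureString parameter (region is an assumed example value)."""
    import boto3  # lazy import: only needed when actually calling AWS
    ssm = boto3.client("ssm", region_name=region_name)
    response = ssm.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"].encode()


# Demonstration with a fake fetcher instead of a real AWS call.
calls = []


def fake_fetch(name):
    calls.append(name)
    return b"demo-key"


get_signing_key(fake_fetch, "/demo/signing-key", now=0)
get_signing_key(fake_fetch, "/demo/signing-key", now=10)
print(len(calls))  # 1: the second lookup was served from the cache
```

In a real handler, `fetch_from_parameter_store` would be passed in place of `fake_fetch`, so only the first request in each execution context (or the first after the TTL lapses) pays for the network call.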

And that’s how it’s going: millions of invocations per day with a 15 msec p99 duration.

Written by Gena Kartashevskyy