How Radon uses AWS to power its home page

Published in Rn Engineering · 7 min read · Apr 3, 2018

We recently released our company web page after several months of discussion, design, and development. We decided to build a Single Page Application (SPA) with React and a custom CMS to supply its content. During this process, we learned a lot about SPAs and how to serve them through Amazon Web Services (AWS). You can see the end result by visiting our web page:

House of Radon (houseofradon.com): a Stockholm-based agency combining creativity, strategy, craft and venture under one roof.

Background

We chose React and Redux to build our SPA, as many others have recently. We started the project with create-react-app, which adds a service worker configuration by default and makes both development and deployment easier.

We also decided to go with Amazon S3 to serve our site without any server side rendering. We use CloudFront in front of the S3 bucket to leverage global caching (and more).

We also built a backend that serves as a CMS, so that people can edit the content shown on the website. We used Clojure and MongoDB for this. I won’t go into the backend details, because they are not the topic of this article. The relevant part is that we also serve our user-facing endpoint through CloudFront in order to cache backend content that changes infrequently. This reduces both the load on our services and the latency on the client.

Towards the end of our project, we wanted to spice up the pages with some metadata so that social media platforms know what a page is about when it is shared. A task that sounded like a 10-minute job turned out to be a real challenge.

Metadata Challenge

We didn’t realize how hard it was to add metadata to our pages until we actually tried it. The problem is that our SPA has only one HTML page, and all of the routes and individual pages are generated through JavaScript. When a crawler visits an individual page, it won’t run the scripts. Instead, it will just take whatever metadata the HTML already contains, and at that point our scripts haven’t generated any yet!

We also couldn’t pre-render our pages and place them under specific folders, because the data comes from our backend CMS. If we pre-rendered and uploaded the pages, they would be outdated as soon as someone updated the data in the backend.

Solution: Lambda@Edge

I must admit that we even thought about setting up a Node.js server somewhere in the cloud to do server side rendering. That was before we found out about Amazon Lambda@Edge. After hours of research and reading through documentation, we found a way to solve our problem without running a server to render our pages.

Lambda@Edge is a relatively new Amazon service that lets you attach Lambda functions to four different types of triggers on CloudFront. Once set up, these Lambda functions are replicated across CloudFront’s edge locations and run at the edge location where they are triggered. So, how does this help us solve our metadata challenge?

The problem definition is simple: serve an HTML file with the correct metadata when a crawler visits a page. In order to do that, we used CloudFront’s Origin Request to trigger our Lambda function. Origin Request is one of the four triggers you can attach your Lambda function to. The four triggers are:

Image Source: Amazon Lambda@Edge Documentation

Origin Request: Triggered before CloudFront forwards a request to the origin.
Origin Response: Triggered after CloudFront receives a response from the origin.
Viewer Request: Triggered when CloudFront receives a request from a viewer.
Viewer Response: Triggered before CloudFront returns a response to the viewer.

You can think of Origin Request and Origin Response as triggers that are called when CloudFront does not have a valid cached resource at the edge and has to go to the origin to create one. Viewer Request and Viewer Response, on the other hand, are triggered every time a user tries to fetch something from a CloudFront edge.

In our case, the Lambda function only has to run when CloudFront needs to fill its cache, and the cached file can then be reused for as long as the data does not change. Our backend automatically invalidates all related resources on CloudFront when something is altered.

We first tried to use Origin Response. We thought that we could inject metadata into the response from S3. However, Lambda@Edge does not let you alter the response body in an Origin Response function. After realizing that, we decided to use Origin Request instead: inside the Lambda function we fetch the actual HTML file from S3, add the metadata, and return the result to CloudFront directly, without letting CloudFront forward the request to S3.

Final Cloud Architecture with Lambda@Edge

Implementation

Another challenge we faced while trying to make Lambda work was the lack of examples. Lambda@Edge is a relatively new product from Amazon, so we couldn’t find a good example of using it to add metadata to an SPA.

At the time of writing, Amazon only lets you use Node.js when you are writing Lambda@Edge functions. We started with a basic function that we can use to download content from either S3 or our backend.

Function to download content from a source

This function does a simple job: it gets the content from the given URL and calls the callback function with the body and headers. We also check the response headers to see whether the content we are downloading is gzipped. If it is, we unzip the response.
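The original gist is not reproduced here, so the following is only a minimal sketch of what such a download helper could look like. The names download, decodeBody and isGzipped, as well as the error-first callback signature, are our assumptions for illustration and not necessarily what the original function used.

```javascript
const https = require('https');
const zlib = require('zlib');

// True when the response headers mark the body as gzip-compressed.
// Node lower-cases incoming header names, so we look up 'content-encoding'.
function isGzipped(headers) {
  return (headers['content-encoding'] || '').toLowerCase().includes('gzip');
}

// Turn the raw response bytes into a string, unzipping first if needed.
// Calls callback(err, body, headers).
function decodeBody(buffer, headers, callback) {
  if (!isGzipped(headers)) {
    callback(null, buffer.toString('utf8'), headers);
    return;
  }
  zlib.gunzip(buffer, (err, unzipped) => {
    if (err) {
      callback(err);
      return;
    }
    callback(null, unzipped.toString('utf8'), headers);
  });
}

// Download `url` over HTTPS and call callback(err, body, headers).
function download(url, callback) {
  https
    .get(url, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        callback(new Error(`Unexpected status ${res.statusCode} for ${url}`));
        return;
      }
      const chunks = [];
      res.on('data', (chunk) => chunks.push(chunk));
      res.on('end', () => decodeBody(Buffer.concat(chunks), res.headers, callback));
    })
    .on('error', callback);
}
```

Keeping the decoding step separate from the network call makes the gzip handling easy to try out locally with a buffer, without deploying anything to CloudFront.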

Next, we needed to decide which CloudFront requests we want to intercept. In our case, we decided to intercept all requests whose URI has no file extension. This helps us in two ways. First, we can serve static assets without interfering with the request. Second, we can send the user to the main page if they try to reach an unknown route. We used the following logic for this purpose:

Basic Lambda handler function demonstrating extension check.

This is a regular Lambda handler with event, context and callback as arguments. We get the CloudFront request from the event and parse its uri with the path library. This lets us check the extension of the uri easily. For example, a uri in the form of /path/to/page yields an empty string as its extension. However, for a uri like /path/to/static/asset.js, path.extname returns ‘.js’ (including the leading dot).
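Since the original gist is not included here, this is a small sketch of the extension check under our own assumptions. The helper name isPageRequest is hypothetical, and rewriting the uri to /index.html stands in for the metadata injection that comes later.

```javascript
const path = require('path');

// True when the URI has no file extension, i.e. it looks like an SPA route.
function isPageRequest(uri) {
  return path.extname(uri) === '';
}

// Origin Request handler: event.Records[0].cf.request is the CloudFront request.
function handler(event, context, callback) {
  const request = event.Records[0].cf.request;

  if (!isPageRequest(request.uri)) {
    // Static asset: hand the untouched request back so CloudFront fetches it from S3.
    callback(null, request);
    return;
  }

  // SPA route: point the request at index.html (metadata injection would go here).
  request.uri = '/index.html';
  callback(null, request);
}

exports.handler = handler;
```

Note that a route whose last segment contains a dot (for example a version number) would be treated as a static asset by this check, so it only suits sites whose routes never contain dots.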

Once we know we need to intercept the request and inject metadata to response, we can proceed. We split the upcoming process into two parts. First, we need to get metadata from our backend that is relevant to this specific uri we are dealing with. We do this by using the following function:

Function that fetches metadata from given url and returning a string with meta tags.

This function gets the relevant metadata from our CloudFront-cached backend, parses the JSON response, and constructs a string with the relevant HTML meta tags. It calls the given callback function with the constructed meta tags and the response headers. We’ll use these headers in the next step to tell whether the metadata has changed.
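The gist itself is missing here, and the article does not show the JSON our backend returns, so the sketch below assumes a hypothetical shape with title, description and image fields. The function names and the choice of Open Graph tags are also assumptions for illustration only.

```javascript
// Escape characters that would break out of an HTML attribute or text node.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the tags for one page from a parsed metadata object.
// Assumed (hypothetical) shape: { title, description, image }.
function buildMetaTags(meta) {
  const tags = [`<title>${escapeHtml(meta.title || '')}</title>`];
  const properties = {
    'og:title': meta.title,
    'og:description': meta.description,
    'og:image': meta.image,
  };
  for (const [property, content] of Object.entries(properties)) {
    if (content) {
      tags.push(`<meta property="${property}" content="${escapeHtml(content)}">`);
    }
  }
  return tags.join('\n');
}

// Parse the backend's JSON body and pass the tags plus the headers on.
// Calls callback(err, metaTags, headers).
function metaTagsFromResponse(body, headers, callback) {
  let meta;
  try {
    meta = JSON.parse(body);
  } catch (err) {
    callback(err);
    return;
  }
  callback(null, buildMetaTags(meta), headers);
}
```

Escaping matters here because the values come from a CMS: a title containing a quote or an angle bracket would otherwise break the markup that crawlers read.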

The next step is to get our index.html from S3 and inject the meta tags into it. For this purpose, we use the following function:

Function that fetches index.html from S3 and constructs CloudFront response with metadata injected index.html body and headers.

This looks like a long function, but it actually does not do that much. We replace the title tag in our original HTML file with the metadata from the previous step and then gzip the content again. Finally, we set the correct headers for the response. The trick here is the ETag. We wanted to have a consistent ETag for our index.html. To achieve that, we combined the ETag of the original index.html with the ETag of the metadata response. This way, the combined ETag changes whenever either index.html or the metadata changes.
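As the gist is not reproduced here, the following sketch shows one way the injection, gzip and ETag steps could fit together. The function names, the way the two ETags are joined, and the exact set of response headers are our assumptions; only the overall response shape (status, headers, base64 body) follows the format CloudFront expects from a generated response.

```javascript
const zlib = require('zlib');

// Swap the existing <title>...</title> element for the generated tags.
function injectMetaTags(html, metaTags) {
  // A replacer function keeps "$" sequences in metaTags from being interpreted.
  return html.replace(/<title>[\s\S]*?<\/title>/i, () => metaTags);
}

// Join two ETags into one value that changes when either input changes.
// Quotes and a weak-validator prefix (W/) are stripped before joining.
function combineEtags(htmlEtag, metaEtag) {
  const strip = (etag) => String(etag || '').replace(/^W\//, '').replace(/"/g, '');
  return `"${strip(htmlEtag)}-${strip(metaEtag)}"`;
}

// Build a CloudFront response object from index.html and the meta tags.
function buildResponse(html, htmlHeaders, metaTags, metaHeaders) {
  const body = zlib.gzipSync(injectMetaTags(html, metaTags));
  return {
    status: '200',
    statusDescription: 'OK',
    headers: {
      'content-type': [{ key: 'Content-Type', value: 'text/html; charset=utf-8' }],
      'content-encoding': [{ key: 'Content-Encoding', value: 'gzip' }],
      etag: [{ key: 'ETag', value: combineEtags(htmlHeaders.etag, metaHeaders.etag) }],
    },
    // Binary (gzipped) bodies have to be returned base64-encoded.
    body: body.toString('base64'),
    bodyEncoding: 'base64',
  };
}
```

For example, an index.html ETag of "abc" and a metadata ETag of "def" would produce "abc-def" with this scheme, so a deploy of the SPA and an edit in the CMS each yield a new value.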

To combine these functions together, we use the following Lambda handler:

Final Lambda handler

Here we decide which URL to call on our backend by looking at the request uri we are currently handling. Once we have the metadata from the backend, we fetch index.html as well, and if everything is okay, we call the actual Lambda callback with the new response. If anything goes wrong for any reason, we call the callback with the original request, but with its uri set to index.html.
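The final gist is not shown here either, so below is a sketch of the control flow only. To keep it self-contained, the two network steps are passed in as functions; createHandler, fetchMetaTags, fetchIndexResponse and metadataUrlFor are hypothetical names, and api.example.com is a placeholder rather than our real backend.

```javascript
const path = require('path');

// Hypothetical mapping from a page URI to the backend's metadata endpoint.
function metadataUrlFor(uri) {
  return `https://api.example.com/metadata?path=${encodeURIComponent(uri)}`;
}

// Build an Origin Request handler around two injected async steps:
//   fetchMetaTags(url, cb)                  -> cb(err, metaTags, metaHeaders)
//   fetchIndexResponse(tags, headers, cb)   -> cb(err, cloudFrontResponse)
function createHandler({ fetchMetaTags, fetchIndexResponse }) {
  return (event, context, callback) => {
    const request = event.Records[0].cf.request;

    // Static assets pass through untouched.
    if (path.extname(request.uri) !== '') {
      callback(null, request);
      return;
    }

    // Fallback: let CloudFront fetch the plain index.html from S3.
    const fallback = () => {
      request.uri = '/index.html';
      callback(null, request);
    };

    fetchMetaTags(metadataUrlFor(request.uri), (metaErr, metaTags, metaHeaders) => {
      if (metaErr) {
        fallback();
        return;
      }
      fetchIndexResponse(metaTags, metaHeaders, (htmlErr, response) => {
        if (htmlErr) {
          fallback();
          return;
        }
        // Returning a response (not a request) stops CloudFront from calling the origin.
        callback(null, response);
      });
    });
  };
}
```

Falling back to the unmodified index.html means a backend outage costs us the metadata for that page, but visitors still get a working site.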

To sum up the entire Lambda@Edge function, you can check out the following gist:

Source code for Lambda@Edge function to inject metadata into an SPA

Conclusion

We tried to solve the social media sharing problem in our Single Page Application with dynamic content by using Lambda@Edge. In the end, we were satisfied with the solution, because we didn’t need to implement server side rendering.

We chose Lambda@Edge because it fits our needs. We also wanted to play with it to see what else we could do. Developing a function for Lambda@Edge is still a bit of a painful process, since every time you deploy a new version, it needs to propagate to all nodes around the world before you can see your changes live.

All in all, we had a nice experience with Lambda@Edge. It took us a couple of days to figure out and implement everything. We can definitely recommend this solution if you have the same problem but do not want to use server side rendering.

This was an experiment that ended well. If you have thoughts about this or example solutions to the same problem, we would like to hear them! We also used Lambda@Edge in another project where we had a static Single Page Application. You can check that out if you don’t have dynamically generated content in your SPA.

If you would like to work on important, hard, fun, boring, and challenging projects using AWS, React, Clojure and many other cutting edge technologies, feel free to get in touch with us!