[Part 2] Accelerating Load Times: A Materialized View and Server-side Composition Case Study

Yedidya Schwartz
OwnID Engineering
Apr 4, 2023 · 8 min read

In the previous part, I introduced the product, mapped the problem, and detailed the design patterns that will be used to solve the problem: Materialized View and Server-side Composition.

In this part, I will map the five objectives the solution must meet, and then go over them step by step, showing how we built the full solution. At each step, I will show how the objective is expressed in the architecture. I will explore Redis as a resource caching layer, discuss the essential logic contained in the baking-server, and highlight the impressive capabilities of CloudFront as a CDN, including some of its most intriguing behaviors.

Our Requirements for the Solution

In part one, I presented the current widget loading approach and the main idea behind the applied patterns. Now, let's list the five required technical aspects:

  1. Merge the four network requests into one request that contains all four requests' content.
  2. The latency of that single request must be very low, as close to zero milliseconds as possible (i.e., the request should be heavily cached).
  3. If the requested content isn't ready (i.e., baked) yet, it should still be accessible in real time, within no more than one second.
  4. The response content should always contain the most up-to-date data, with the latest settings configured by the customer in the management console. A delay of one to two minutes at most is acceptable (e.g., if the widget color was changed from red to green, all end-users will start seeing a green widget within two minutes at most).
  5. The baked files should support a high volume of requests: tens of thousands per second at peak time.

Accomplishing these five objectives is a must to declare the mission a success. The main challenge is to apply them to our existing architecture in a way that won't require massive changes.

Figure 1: The architecture of widget loading flow before the change

Furthermore, we can't risk breaking the existing widget loading approach, as we must maintain backward compatibility for customers until they migrate to the new solution.

As I explained in part one, the existing widget loading process performs an individual network request per resource. Let's start improving it.

Redis as a Resource Cache Layer

As you can see in figure 1, the resources are fetched from S3 and from the multi-tenant server. So, as the first step in optimizing the baking process we are about to develop, we want to make sure that every required resource is cached in Redis. This way, the baking-server (described in the next section) can fetch the content it needs to bake in the fastest possible way: from the Redis cache, instead of downloading it from S3 or performing HTTP requests to other services, both of which would be much slower.

A separate service, config-server, is responsible for loading the resources into Redis.

Figure 2: The architecture with the addition of Redis as the resource cache, and config-server, the service responsible for reading the resources from their origins and writing them into Redis

This part addresses requirement #3: if the requested content isn't ready (i.e., baked) yet, it should still be accessible in real time, within no more than one second. As I will describe in the next section, the resources will be accessible to the baking-server from the Redis cache layer, without the need to read them from S3 or other services, so reading all the resources shouldn't take more than a few tens of milliseconds.
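To make this step concrete, here is a minimal sketch of what such a config-server refresh job could look like in TypeScript. The bucket name, Redis key scheme, and multi-tenant endpoint are illustrative assumptions, not the actual OwnID implementation:

```typescript
import Redis from "ioredis";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");
const s3 = new S3Client({});

// Cache a static resource from S3 (e.g., the widget SDK source) in Redis.
async function cacheS3Resource(bucket: string, key: string): Promise<void> {
  const object = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  await redis.set(`resource:${key}`, await object.Body!.transformToString());
}

// Cache a customer's widget settings, fetched from the multi-tenant server.
async function cacheAppConfig(appId: string): Promise<void> {
  const res = await fetch(`https://multi-tenant.internal/apps/${appId}/config`);
  await redis.set(`config:${appId}`, await res.text());
}
```

With every resource sitting in Redis, a full read by the baking-server becomes a handful of in-memory lookups instead of S3 downloads or cross-service HTTP calls.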

The next step in the architecture is the baking-server, the service responsible for reading all the cached resources and baking them into one file that is served to the end-users.

Baking Server

As the core of the solution, we created a new microservice that owns the baking business. This service has a well-defined responsibility: an application ID as input, and a JS file containing all the required resources as output.

The logic is very simple: the baking-server reads the relevant resources from Redis (which always holds their most updated versions) and replaces the relevant placeholders in a pre-defined JS structure that was agreed upon with our frontend side.

The baking-server's high-level logic is sketched below in TypeScript. The template, placeholder names, and Redis keys are illustrative assumptions, not the actual OwnID code:
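```typescript
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// A hypothetical pre-defined JS structure, agreed upon with the frontend side.
// (The real template merges four resources; two are shown here for brevity.)
const TEMPLATE = `
(function () {
  var config = {{CONFIG}};
  {{SDK_SOURCE}}
  initWidget(config);
})();
`;

// Application ID in, one baked JS file out.
async function bake(appId: string): Promise<string> {
  // Redis always holds the most updated version of each resource.
  const [config, sdkSource] = await Promise.all([
    redis.get(`config:${appId}`),
    redis.get("resource:sdk.js"),
  ]);
  if (config === null || sdkSource === null) {
    throw new Error(`resources for ${appId} are not cached yet`);
  }
  return TEMPLATE.replace("{{CONFIG}}", config).replace("{{SDK_SOURCE}}", sdkSource);
}
```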

An example of a request to the service:

https://cdn.ownid.com/sdk/jflb8atm2yywr3

Figure 3: The architecture with the addition of the Baking Server, which holds the core logic of the solution

This part addresses requirement #1: merge the four network requests into one request that contains all four requests' content.

Now, we need to make sure that the baking-server won't suffer at peak traffic times; as our customers may have tens of thousands of visitors on their websites, we don't want our servers to crash under the pressure.

CDN as the Baked Files Cache Layer

As mentioned, the next part of the architecture I will explain is the CDN between the end-user and the baking-server.

The strategy we chose is pre-baking; nevertheless, it has an on-demand aspect for the first end-user that accesses a file: only that first request triggers the baking-server logic.

The first request for a file reaches the baking-server, but the next end-users who request the file will get it from the CDN.

We are using AWS CloudFront as our CDN. Each baked file is stored at the edge locations, and that protects the baking-server (also known as the "origin") from getting tons of baking requests.

To be more accurate: for each of our customers' websites, only one end-user per edge location (e.g., one end-user from Chicago, one from London, one from Prague, etc.) triggers the baking-server action, and only these end-users experience the slower loading of the widget.

After they experience the fallback from the CDN to the baking-server, the baked file is stored in the CDN edge location, and from that point on it is served directly from the edge location, without involving the baking-server, in an average time of 15 ms.

Figure 4: The architecture with the addition of a CDN in front of the Baking Server

This part addresses requirements #2 and #5: the single request latency will be very low, as close to 0 ms as possible, and the baked files will support a high volume of requests, tens of thousands per second at peak time.
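To illustrate how the pieces could connect (a sketch, not necessarily how OwnID wired it): the baking-server can return a long Cache-Control max-age, which CloudFront honors when deciding how long to keep the baked file at the edge. Here is a hypothetical Express handler around the bake function sketched above:

```typescript
import express from "express";

// bake() is the baking logic sketched in the Baking Server section.
declare function bake(appId: string): Promise<string>;

const app = express();

// Only the first request per edge location ever reaches this handler;
// afterwards, CloudFront serves the baked file straight from the edge.
app.get("/sdk/:appId", async (req, res) => {
  const file = await bake(req.params.appId);
  res
    .type("application/javascript")
    // A long max-age lets the edge keep the file until it is explicitly
    // invalidated (the invalidation flow is covered in the next part).
    .set("Cache-Control", "public, max-age=31536000")
    .send(file);
});

app.listen(3000);
```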

The Power of CloudFront as a CDN

I would like to share two interesting concepts and behaviors of CloudFront that I found very useful.

Request Collapsing

Let's assume that during the rush hour of the day, a customer performs an update in our management console and changes some widget configuration. The "save" action invalidates the baked file in the CDN (in the next article in the series, I will explain the invalidation process in depth), so the next request for the file will fall back to the origin: the baking-server.

Now, as it is rush hour, there may be thousands of end-users requesting the file in parallel within the same second. Since the file doesn't exist in the CDN, you would expect the baking-server to get all those thousands of requests, resulting in a huge load that might cause issues for our backend infrastructure.

Exactly for this edge case, AWS introduced the request collapsing feature as part of CloudFront's default behavior:

When a CloudFront edge location receives a request for an object and the object isn’t in the cache or the cached object is expired, CloudFront immediately sends the request to the origin. However, if there are simultaneous requests for the same object — that is, if additional requests for the same object (with the same cache key) arrive at the edge location before CloudFront receives the response to the first request — CloudFront pauses before forwarding the additional requests to the origin. This brief pause helps to reduce the load on the origin. CloudFront sends the response from the original request to all the requests that it received while it was paused. This is called request collapsing.

(AWS documentation)
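A rough way to probe this from the client side is to fire parallel requests at the same URL and count the X-Cache response headers that report an edge miss; the definitive signal of collapsing, though, is the request count your origin actually receives. A sketch, reusing the example URL from above:

```typescript
// Fire many parallel requests for the same object right after invalidation
// and count how many were reported as misses at the edge.
const url = "https://cdn.ownid.com/sdk/jflb8atm2yywr3";

async function probe(parallel: number): Promise<void> {
  const responses = await Promise.all(
    Array.from({ length: parallel }, () => fetch(url)),
  );
  const misses = responses.filter(
    (r) => r.headers.get("x-cache")?.startsWith("Miss") ?? false,
  ).length;
  console.log(`${parallel} parallel requests -> ${misses} edge misses`);
}

probe(50);
```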

Origin Shield

Another useful feature I want to mention is Origin Shield. Although we didn't use it, it's important to be aware of this option in case your origin gets too many requests and you would like to reduce them.

The following paragraph from the AWS documentation explains the basics of the CloudFront architecture:

With Amazon CloudFront, you inherently get a reduced load on your origin because requests that CloudFront can serve from the cache don’t go to your origin. In addition to CloudFront’s global network of edge locations, regional edge caches serve as a mid-tier caching layer to provide cache hits and consolidate origin requests for viewers in nearby geographical regions. Viewer requests are routed first to a nearby CloudFront edge location, and if the object isn’t cached in that location, the request is sent on to a regional edge cache.

(AWS documentation)

As you can see, the basic CloudFront architecture contains two cache levels: edge locations and regional edge caches. Later in the documentation, you can read how Origin Shield can give your origin extra protection by adding a third cache layer:

When viewers are in different geographical regions, requests can be routed through different regional edge caches, each of which can send a request to your origin for the same content. But with Origin Shield, you get an additional layer of caching between the regional edge caches and your origin. All requests from all regional edge caches go through Origin Shield, further reducing the load on your origin.

(AWS documentation)
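We didn't need it, but for reference, if your distribution is defined with the AWS CDK, enabling Origin Shield is a single property on the origin. A sketch (the domain and region are placeholders):

```typescript
import { Stack } from "aws-cdk-lib";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";

declare const stack: Stack; // a CDK stack defined elsewhere

// Route all regional edge caches through one Origin Shield region,
// adding a third cache layer in front of the baking-server origin.
new cloudfront.Distribution(stack, "BakedFilesDistribution", {
  defaultBehavior: {
    origin: new origins.HttpOrigin("baking-server.example.com", {
      originShieldRegion: "us-east-1",
    }),
  },
});
```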

In the following diagram, you can see the fully described CloudFront architecture with Origin Shield ("AWS Elemental MediaPackage" is the origin, i.e., the baking-server in our use case):

Figure 5: CloudFront architecture with Origin Shield (Diagram by AWS)

Summary

In this part of the series, I wrote about the objectives that define the solution as successful, and then started describing the steps we took to apply the design patterns to our architecture.

I started with Redis as a resource cache layer, and then presented the baking-server, which owns the core logic of the solution. In the last section, I introduced the power of CloudFront as a CDN and some of its most interesting behaviors.

In the next article, which will be the last in this series, I'll explain how we verified that the CDN solution works as expected, and then I will add the last missing part of the architecture: Redis pub/sub as the invalidation trigger.

You are welcome to subscribe to the OwnID Engineering newsletter to get updated when the next part is published.
