Image courtesy of Chaval Brasil

Scale, two ways

Size is a funny thing. We’re often told it doesn’t matter, but as an engineer at 383, there’s little else I think about.

At 383, we create outstanding customer experiences by building digital products and services for some of the most innovative organisations. We’ve encountered a number of challenges in building for our clients. I want to describe the architecture we’ve used for two of our recent apps.

First, we built a large e-commerce site which regularly serves thousands of users a day (I know, that’s not massive scale). The site normally runs on two moderately sized servers. From time to time the site has a sale which causes the traffic to grow 100x within an hour. A spike that dramatic is harder to handle than consistently high load would be.

The site is very dynamic: there are a lot of filters, which can be applied in about 1 × 10⁵ combinations. Aside from heavy caching, there is little we can do to mitigate the load.
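To make the caching point a little more concrete, here is a minimal sketch of one way to get more out of a cache when there are that many filter combinations: normalise the filters before building the cache key, so the same selection always hits the same entry regardless of the order it arrives in. This is an illustration rather than our production code; the function name, the key format and the example filters are all assumptions.

```python
import hashlib
import json


def cache_key(path, filters):
    """Build a stable cache key for a page plus its selected filters.

    `filters` maps a filter name to the list of selected values.
    Names and values are sorted so that ordering differences in the
    request do not produce different keys (hypothetical scheme).
    """
    canonical = json.dumps(
        {name: sorted(values) for name, values in filters.items()},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{path}:{digest}"


if __name__ == "__main__":
    # Two requests for the same selection, supplied in a different order.
    a = cache_key("/shoes", {"colour": ["red", "blue"], "size": ["9"]})
    b = cache_key("/shoes", {"size": ["9"], "colour": ["blue", "red"]})
    print(a == b)
```

Normalising like this doesn’t reduce the number of genuine combinations, but it stops equivalent requests from being cached more than once.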

We found that using cloud hosting on a pay-as-you-go model and Ansible playbooks to configure the servers is fundamental to our ability to grow with our clients. As a team, we’ve had to adapt and learn some DevOps skills, but we’re now empowered to build the best possible apps.

Our hosting fees are often fractions of what our clients have previously been quoted. We’re nimble and can add or remove servers on a whim. Cloud hosting has given us more control than would have been possible just a couple of years ago. In an environment where servers are billed by the hour (or minute in some cases), we shouldn’t be constrained by traditional procurement processes.

Architecture for brute force scale

The only way to scale this app is to increase the compute capacity of the system.
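As a back-of-the-envelope sketch of what brute force means in practice, the snippet below estimates how many servers a traffic peak needs once the cache has absorbed its share. The function name and every number in it are made up for illustration; they are not our real traffic figures, and real capacity rarely scales this linearly.

```python
import math


def servers_needed(peak_rps, per_server_rps, cache_hit_ratio=0.0):
    """Estimate the servers required to absorb a traffic peak.

    peak_rps        -- requests per second at the peak (assumed figure)
    per_server_rps  -- requests per second one server can handle (assumed)
    cache_hit_ratio -- fraction of requests answered from cache, 0.0 to 1.0
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Only requests that miss the cache reach the application servers.
    uncached_rps = peak_rps * (1.0 - cache_hit_ratio)
    # Always keep at least one server running.
    return max(1, math.ceil(uncached_rps / per_server_rps))


if __name__ == "__main__":
    # Hypothetical sale-day peak, with and without a warm cache.
    print(servers_needed(peak_rps=1000, per_server_rps=50))
    print(servers_needed(peak_rps=1000, per_server_rps=50, cache_hit_ratio=0.5))
```

With pay-as-you-go hosting, a number like this only has to be paid for during the hours of the sale.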

Second, we built a portal which is served to millions of users per day. If we used the same design here, we’d quickly lose ourselves in the number of servers required.

We’re fortunate, though: the data doesn’t change very often and isn’t very dynamic. We decided to offload all of the challenges of high traffic to a content delivery network (CDN). CDNs have lots of servers around the globe which serve static files really quickly.

We wanted to see if we could build every version of the page and serve the entire site without incurring any load on the server. This felt like the most efficient way of solving this problem.

We want to shift as much load “offline” as possible. Some, if not most, things don’t need to happen in real time. We can compute results in batches rather than doing it just in time. Our data can be slightly stale and no one will notice or care.

We build all combinations of the page (1 × 10³) and let the CDN quickly and efficiently provide these pages to our end-users. We periodically enqueue a job to rebuild each page so that the data never becomes too out of date. A single build server pulls jobs off the queue, generates the page and provides it to the CDN to serve.
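The build loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not our actual build system: the three page dimensions, the path scheme and the `publish` callback are hypothetical, an in-memory queue stands in for a real job queue, and a plain dictionary stands in for the CDN’s origin storage.

```python
import itertools
import queue

# Hypothetical page dimensions: 10 x 10 x 10 = 1,000 page variants.
REGIONS = [f"region-{i}" for i in range(10)]
CATEGORIES = [f"category-{i}" for i in range(10)]
SORTS = [f"sort-{i}" for i in range(10)]


def all_variants():
    """Yield every combination of the page as a (region, category, sort) tuple."""
    return itertools.product(REGIONS, CATEGORIES, SORTS)


def path_for(variant):
    """Map a variant to the static file path the CDN would serve."""
    return "/" + "/".join(variant) + "/index.html"


def render(variant):
    """Stand-in for the real page generation step."""
    region, category, sort = variant
    return f"<html><body>{region} / {category} / {sort}</body></html>"


def enqueue_all(jobs):
    """Periodic task: queue a rebuild job for every variant."""
    count = 0
    for variant in all_variants():
        jobs.put(variant)
        count += 1
    return count


def run_build_worker(jobs, publish):
    """Single build worker: drain the queue, render each page, publish it."""
    built = 0
    while True:
        try:
            variant = jobs.get_nowait()
        except queue.Empty:
            return built
        publish(path_for(variant), render(variant))
        built += 1


if __name__ == "__main__":
    jobs = queue.Queue()
    enqueue_all(jobs)
    cdn_origin = {}  # stand-in for the storage the CDN reads from
    built = run_build_worker(jobs, cdn_origin.__setitem__)
    print(f"built {built} pages")
```

The point of the design is that user traffic never touches `render`: the only thing that grows with the number of visitors is the CDN, while the build server’s workload depends only on how many pages exist and how often we choose to refresh them.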

Architecture to scale using an offline build

We scaled this app by spreading the load out over time.

There are two types of scale at play here. One requires sheer brute force and the other a more finessed approach. Thinking at scale has forced us to adopt new tools and techniques, and to really consider how people will interact with our apps.

Eric Schmidt of Google, arguably the masters of scale, said:

You’ve got to have products that can scale. What’s new is that once you have that product, you can scale very quickly. Look at Uber.

In years past, we built good websites for our clients without needing to consider scale at all. Lately, as we’ve been building outstanding customer experiences, scale has been front and centre. We simply couldn’t deliver the same quality of product if we didn’t think about how it would work for the 1ˢᵗ user and the 100,000,000ᵗʰ user.

These approaches aren’t one-size-fits-all. You need to consider each case on its own merits to really understand which parts need to be grown vertically (more resources on a single node), which can be grown horizontally (more nodes with moderate resources), and which need something more creative.

You may be surprised at the performance you can squeeze out of your product by masking where some of the work is done.