Scaling explained to my mum: a restaurant analogy

Antoine Laurent
Published in Unibuddy
7 min read · Jan 25, 2022

Welcome to the first instalment in the series ‘Non-tech blogs on a tech blog’. I’ve always thought that technology, and especially software engineering, is really just common sense, and that anybody can understand what we’re doing.

The problem is that engineers are used to talking about technical subjects with fellow engineers, so they use lots of technical terms and acronyms that stop anyone else from understanding even the most basic of issues.

Today, we are going to tackle a central problem for most engineers who work at start-ups / scale-ups like Unibuddy: how to scale.

Scaling means being able to handle however many people happen to be using your website at the same time. Perhaps you’ve been featured in a national newspaper overnight and suddenly have an influx of readers on your site, or, like Unibuddy, you’re growing your presence internationally and have thousands of new applicants connecting to your platform every month.

So let’s explain the concept of Scaling using a topic I love, and that most people need in order to survive: food.

Let’s imagine a restaurant…

A customer enters the room, sits down at a table, and asks the waiter for a menu. The chef is going to source the ingredients, cook whatever the customer orders, and the waiter is going to deliver that to them.

If we switch to the world of technology, specifically websites, this is what happens when you request a web page: your web browser makes a request to a server for the data we want to display on the page. That server finds the data in a database and returns it to the web browser.

In this analogy, the customer is the web browser (e.g. Google Chrome, Safari, Edge, etc), the waiter is the server (sounds about the same to me), the order is the data requested and the kitchen is the database (which keeps track of available ingredients, recipes, etc).
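For the engineers (or the very curious) reading along, here is a tiny sketch of that flow in Python. It uses Flask and SQLite purely for illustration; the endpoint, database file, and table names are all invented, not Unibuddy’s actual code.

```python
# A tiny sketch of the customer -> waiter -> kitchen flow. Flask and SQLite are
# used purely for illustration; the endpoint, database file, and table names
# are invented.
import sqlite3
from flask import Flask, jsonify

app = Flask(__name__)  # the "waiter": takes requests and brings back responses

@app.route("/menu")
def get_menu():
    # The waiter asks the kitchen (the database) what is available...
    connection = sqlite3.connect("restaurant.db")  # assumes this file exists
    dishes = connection.execute("SELECT name, price FROM dishes").fetchall()
    connection.close()
    # ...and serves it to the customer (the web browser) as a response.
    return jsonify([{"name": name, "price": price} for name, price in dishes])

if __name__ == "__main__":
    app.run()  # one waiter, one kitchen: perfectly fine for a single customer
```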

In order to keep the customer happy, we need to make sure the time between ordering and serving the food is as short as possible, and that the order is prepared as well as it can be.

Right now, we have one customer, one waiter, and one chef in the kitchen, and it’s all good. Everybody is happy and the restaurant can handle all the incoming requests.

Until…

10 more people arrive…

But wait, more people enter the restaurant all at once!

The waiter begins to take orders from everyone, dispatching them to the kitchen as fast as they can. As usual, you expect your order to be ready as soon as the kitchen can prepare and cook these delicious meals. However, when customers raise their hands to ask for something else and the waiter is busy, they have to wait a little longer.

The same thing happens in the kitchen: the chef is organised and able to cook several dishes in parallel, but it still takes longer than expected to get them out to the customers.

That’s not too much of an issue… until another 50 people walk into the restaurant!

Avoid a Kitchen Nightmare

We’ve all seen enough of Gordon Ramsay’s ‘Kitchen Nightmares’ to know that one waiter and an average chef won’t be able to cater to 50 people on their own. They won’t be able to serve the food in time, and many people will leave the restaurant, angry and hungry.

Guess what: it’s the same for websites. They can be a nightmare of their own. If too many people request web pages at the same time (that’s you and me going onto a website at the same time and loading the homepage), the server is going to be overwhelmed and the database will not be able to process everyone’s data efficiently. As a result, the requests will time out, and instead of the page you asked for you’ll get an error message.
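To put some rough, entirely invented numbers on why this happens: if one server worker takes a tenth of a second per request and fifty requests arrive at once, the last one in the queue waits five seconds, which is longer than most browsers and load balancers are willing to wait. A tiny sketch:

```python
# Entirely invented numbers: one server worker takes 0.1s per request.
# If 50 requests arrive at once, the last one in the queue waits 5 seconds,
# which is longer than a typical client is willing to wait.
SECONDS_PER_REQUEST = 0.1  # assumed handling time per request
TIMEOUT_SECONDS = 3.0      # assumed client timeout

def wait_time(position_in_queue: int) -> float:
    """How long the request at this position waits before it is handled."""
    return position_in_queue * SECONDS_PER_REQUEST

for position in (1, 10, 50):
    waited = wait_time(position)
    status = "served" if waited <= TIMEOUT_SECONDS else "timed out"
    print(f"Request #{position}: waits {waited:.1f}s -> {status}")
```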

So what can we do to solve this problem?

Well, we generally only have two options:

The first option is to hire a better waiter, one so skilled they can take all the orders really fast, and to get a chef like Gordon Ramsay into the kitchen, who can cook your dishes with maximum efficiency. In computer terms, we buy better hardware, which is called “scaling vertically”. But there’s a limit to this! Firstly, there’s a physical limit: there’s a point where no chef and no waiter can go faster (in theory the speed of light, in reality way before that). Secondly, even if you can find the most skilled people, they are going to be costly.
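Here is a back-of-the-envelope sketch of that limit, with completely made-up speeds and prices: however big a machine you buy, there comes a point where it still isn’t enough, and the price climbs steeply along the way.

```python
# Made-up sizes and prices, just to show the shape of vertical scaling:
# bigger machines handle more, cost disproportionately more, and eventually
# there is simply no bigger machine to buy.
servers = [
    {"size": "small", "requests_per_second": 100, "cost_per_month": 50},
    {"size": "large", "requests_per_second": 400, "cost_per_month": 300},
    {"size": "huge",  "requests_per_second": 800, "cost_per_month": 1200},
]

expected_load = 2000  # requests per second during a traffic spike (invented)

for server in servers:
    enough = server["requests_per_second"] >= expected_load
    print(f"{server['size']}: {server['requests_per_second']} req/s "
          f"for £{server['cost_per_month']}/month -> "
          f"{'enough' if enough else 'still not enough'}")
```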

The second option is to add more waiters. If we increase the number of waiters in the restaurant, we can serve more tables at the same time. But surely that’s just as expensive as the first option? Well, you can adapt the cost: hire extra staff for the weekend, or take on more in the summer if your restaurant is seasonal. In short, you only hire more waiters when you have more customers to serve.

In computer science, we can do the same with our services, and we call that “scaling horizontally”: it lets us add more power when more people come in and reduce it when fewer people are around.
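In practice this is usually automated. Here is a toy autoscaling rule in the spirit of “hire extra waiters for the weekend”; the capacities and thresholds are invented for illustration, not any real cloud provider’s policy.

```python
# A toy autoscaling rule: add a server when the existing ones are busy,
# remove one when they sit idle. Capacities and thresholds are invented.
REQUESTS_PER_SERVER = 100   # how many requests/second one server can handle
SCALE_UP_THRESHOLD = 0.8    # add a server above 80% utilisation
SCALE_DOWN_THRESHOLD = 0.3  # remove a server below 30% utilisation

def decide(server_count: int, current_load: float) -> int:
    """Return the new number of servers for the current load (requests/second)."""
    utilisation = current_load / (server_count * REQUESTS_PER_SERVER)
    if utilisation > SCALE_UP_THRESHOLD:
        return server_count + 1
    if utilisation < SCALE_DOWN_THRESHOLD and server_count > 1:
        return server_count - 1
    return server_count

servers = 1
for load in [50, 120, 300, 450, 90, 40]:  # a day's worth of traffic, invented
    servers = decide(servers, load)
    print(f"load={load} req/s -> {servers} server(s)")
```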

That’s great. Magical, you say! That resolves all our issues! So why does this blog keep going?

You should note that I’ve left the chef out of this. Why is that? Well, you can add more cooks in the kitchen, but you can only have one head chef, and one supply of food. Adding more sous-chefs just scales up the kitchen; the chef still needs to inspect every plate before it goes out to the customers, or they risk a drop in quality, and let’s be honest, nobody wants to find a hair in their soup!

It’s much the same with databases: they are hard to scale horizontally. We won’t get into the technical details as to why, but in short, they have the same problem as the chef: the data needs to be kept in sync so we don’t serve the wrong thing, at the wrong time, to the wrong person.

In reality, as long as the servers are well organised, this model is going to hold for a while, even if the kitchen is fairly small.

“ — I want a burger!”

… “But sir, it’s an Italian restaurant”

Restaurant clients can be demanding. And the more you grow your business, the larger the menu is. How confusing will it be for a chef to have to learn every dish? Switch from a burger to a pizza? Or go from cooking a juicy steak to boiling Chinese noodles?

Usually, restaurants have an area of expertise and a limited number of things that they know how to do well. How many times has Gordon Ramsay said to a restaurant owner, “Change your menu, there are too many things on it!”?

Another thing to be concerned about: if a lot of people want to eat Italian, customers who want Mexican food instead are going to see their meals delayed, because the kitchen is focused on its Italian cuisine.

The solution here would be to split them into two different restaurants. If there’s more demand for Italian food, the Italian restaurant can be bigger than the other. And if there are too many people in that restaurant one day, it won’t affect clients who want to eat different kinds of food.

We have the same kind of strategy in computer science. Consider the different resources that a server needs to deliver. Let’s say, for instance, that “user profile” is one resource and “chat” is another. Maybe you only have a few profiles to load in your application, but a lot of chat data flowing through. These two services don’t have the same requirements.

If your chat is down for some reason, you would still like to see user profiles, and vice versa. And the engineers developing the chat can become deeper experts in their area if they don’t also have to build profile features.

This approach, where we split the code and run it on different servers depending on what it needs to deliver, is called a Microservice Architecture. It’s efficient because each team is responsible for a small piece of software with a focused purpose, and each piece can scale independently.
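To make that concrete, here is a minimal sketch of the “two restaurants” idea, squeezed into one snippet for brevity. In reality each service would be its own codebase and deployment; the endpoints, ports, and data are invented.

```python
# A minimal sketch of two independent services: the profile service and the
# chat service are separate applications, so each can be deployed and scaled
# on its own. All names, ports, and data are invented.
from flask import Flask, jsonify

# Profile service: small and quiet, one instance is usually plenty.
profile_app = Flask("profile_service")

@profile_app.route("/profiles/<user_id>")
def get_profile(user_id):
    return jsonify({"id": user_id, "name": "Example Student"})

# Chat service: busy, run many instances behind a load balancer.
chat_app = Flask("chat_service")

@chat_app.route("/chats/<user_id>")
def get_chats(user_id):
    return jsonify({"user": user_id, "messages": ["Hi!", "When do applications open?"]})

if __name__ == "__main__":
    # In production each app runs as its own process or container, e.g.
    # profile_app.run(port=5001) on one machine and chat_app.run(port=5002)
    # on several others.
    profile_app.run(port=5001)
```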

Conclusion

The last section highlights the direction many systems are heading in nowadays. However, anything that enables growth and scale requires more infrastructure, tooling, and process overhead in order to be truly effective.

This is why most codebases start with all code, logic, and infrastructure in the same place (otherwise known as a Monolithic Architecture), as described in the first section, in order to develop faster. Once a codebase matures and the engineering team grows, we can apply scaling techniques like those described above (e.g. a Microservice Architecture).
