Building Scalable Systems using Queues
The definition of a scalable system varies from person to person, but to me, a scalable system is
- Spike tolerant
- Horizontally scalable
- Load balanced
Today we will discuss how we can use queues and event-driven programming to create a scalable system.
A queue is a data structure that follows the First In, First Out (FIFO) access pattern.
What does first in, first out mean in software terms?
Think about how your printer processes requests. What happens when more than one person wants to use the printer at the same time?
Will it interleave the requests?
The answer is no. A printer is a very good example of efficient use of the queue data structure.
It takes in all the requests and puts them in a print job pool, then picks one job from the pool and starts the task.
One thing to add here is that the printer picks the jobs in the order in which they arrived (FIFO).
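The printer behaviour described above can be sketched in a few lines of Python. This is only an illustration: the job names are made up, and `collections.deque` stands in for the printer's job pool.

```python
from collections import deque

# Hypothetical print jobs, listed in the order they arrive at the printer.
jobs = deque()
for name in ["report.pdf", "invoice.pdf", "photo.png"]:
    jobs.append(name)  # enqueue at the back

printed = []
while jobs:
    # Dequeue from the front, so the oldest job is always served first.
    printed.append(jobs.popleft())

print(printed)  # ['report.pdf', 'invoice.pdf', 'photo.png']
```

The jobs come out in exactly the order they went in, which is all that FIFO means.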
Now that we know what a queue is, let’s discuss how this simple thing can be used to scale huge systems.
When to Use a Queue
To understand this, let’s take the example of a vending machine that produces biscuits at a rate of x biscuits/minute. You are the consumer of these biscuits, and you can consume them at a rate of y biscuits/minute, where y ≥ x.
In this scenario, you can consume all the biscuits produced by the machine.
What if the production rate increases by some amount z such that x + z > y?
Now you cannot consume all the biscuits produced by the machine. All the extra produce spills onto the floor.
Seeing your situation, a wise man brings you a tray to hold all the extra biscuits that you cannot consume until the machine’s production rate is back to normal, i.e. x.
In software terms, this tray is called a queue.
A queue is a buffer that stores the extra produce, which in software terms is the number of requests being made to a server. The rate at which the server can respond to these requests is limited by its capacity, but the rate at which requests arrive is not limited by any such constraint.
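To make the tray idea concrete, here is a small simulation. The numbers are assumptions chosen for illustration: a consumption rate of y = 10 biscuits/minute, a normal production rate of x = 8, and a three-minute spike to x + z = 15. The function name `simulate_backlog` is hypothetical.

```python
def simulate_backlog(arrivals_per_minute, capacity_per_minute):
    """Return how many items are waiting on the tray at the end of each minute."""
    backlog = 0
    history = []
    for arrived in arrivals_per_minute:
        backlog += arrived                             # new biscuits land on the tray
        backlog -= min(backlog, capacity_per_minute)   # consume as many as we can
        history.append(backlog)
    return history


# Assumed rates: normal production is 8/minute, the spike is 15/minute.
arrivals = [8, 8, 15, 15, 15, 8, 8, 8, 8, 8, 8, 8, 8]
history = simulate_backlog(arrivals, capacity_per_minute=10)
print(history)  # [0, 0, 5, 10, 15, 13, 11, 9, 7, 5, 3, 1, 0]
```

During the spike the tray fills up to 15 biscuits, and once production is back to x it slowly drains to zero. Nothing is lost; it is only served later.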
System Design with Queues
Let’s revisit what we discussed previously. Here our server can process 10 requests at a time, but it is bombarded with 20 requests. Some interesting questions to think about at this point are:
What will be the behavior of our server?
It will start rejecting the service requests.
How to deal with this problem?
- Increase the serving power of the server (vertical scaling)
- Increase the number of servers (horizontal scaling)
These solutions work very well when the load is consistently higher than what our servers can handle. However, until we notice the situation and scale up, the server stays overloaded and may keep refusing requests.
But what if we make this process asynchronous? Think of visiting a bank to deposit a cheque: the cashier collects all the cheques, acknowledges that they will be processed, and processes them later once he is free from the customers who need immediate service.
Similarly, a queue is placed between the client and the server. It acts as a buffer that helps control the rate at which requests reach the servers.
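The difference between rejecting and buffering can be sketched as follows. This is a simplified, single-threaded model of the 10-versus-20 example above, not a real server: the function names and the `ack:` strings are hypothetical.

```python
from collections import deque

CAPACITY = 10  # requests the server can process at a time (from the example above)


def handle_without_queue(requests):
    """Serve what fits in one go and reject the rest."""
    return requests[:CAPACITY], requests[CAPACITY:]


def handle_with_queue(requests):
    """Acknowledge every request right away, then process them in batches."""
    buffer = deque(requests)
    acks = [f"ack:{r}" for r in requests]  # like the cashier's receipt
    batches = []
    while buffer:
        size = min(CAPACITY, len(buffer))
        batches.append([buffer.popleft() for _ in range(size)])
    return acks, batches


requests = [f"req-{i}" for i in range(20)]
accepted, rejected = handle_without_queue(requests)
acks, batches = handle_with_queue(requests)
print(len(rejected), len(acks), len(batches))  # 10 20 2
```

Without the queue, half of the requests are refused. With it, every caller gets an acknowledgment and the work is done in two rounds at the server's own pace.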
There are multiple types of queues, such as pull-based and push-based, but discussing them is out of the scope of this blog post.
Let’s see how this will work:
Assume 15 people want to use the printer at the same time, so they all send print commands to it. The printer can run only one command at a time, but it won’t reject the other commands. The extra commands are kept in a queue and picked up one by one as the printer completes the current command.
One more benefit of using queues is that we can add one more printer that pulls commands from the same queue. That way, each command is picked from the queue by one of the printers.
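Here is a minimal sketch of two printers sharing one queue, using Python's standard `queue.Queue` and threads. The printer and job names are made up, and real printing is replaced by recording which printer took which command.

```python
import queue
import threading

commands = queue.Queue()
for i in range(15):
    commands.put(f"print-job-{i}")  # 15 people send their commands

handled = []  # (printer name, command) pairs
handled_lock = threading.Lock()


def printer(name):
    """Keep pulling commands from the shared queue until it is empty."""
    while True:
        try:
            command = commands.get_nowait()
        except queue.Empty:
            return
        with handled_lock:
            handled.append((name, command))
        commands.task_done()


printers = [threading.Thread(target=printer, args=(f"printer-{n}",)) for n in (1, 2)]
for t in printers:
    t.start()
for t in printers:
    t.join()

print(len(handled))  # 15
```

How the 15 commands are split between the two printers varies from run to run, but every command is handled once and none is lost. Adding a third printer is just one more thread reading from the same queue.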
The Real Game
We at LambdaTest heavily rely on Apache Kafka for communication between services.
No two services communicate with each other directly. Kafka acts as the medium of communication and as a buffer for requests between the services.
This way, by using event-driven programming, we manage the load on our services, and when required we can scale them horizontally to handle increased load at peak times.
Advantages that we get using this architecture:
- Easy to scale
- The system becomes tolerant to usage spikes
- We can replay messages if required
- Events can be listened to by multiple services (for example, the actual worker service and an analytics service that maintains the usage record)
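The last two advantages, replay and multiple listeners, come from keeping events in a log and letting every consumer track its own position. The toy class below only illustrates that idea in memory. It is not Kafka's API, and the `EventLog` class, its methods, and the event names are all hypothetical.

```python
class EventLog:
    """Toy append-only log; an illustration of the idea, NOT Kafka's API."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> index of the next event to read

    def publish(self, event):
        self._events.append(event)

    def poll(self, consumer):
        """Return all events this consumer has not seen yet."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return batch

    def rewind(self, consumer, offset=0):
        """Move a consumer back so that it re-reads (replays) older events."""
        self._offsets[consumer] = offset


log = EventLog()
log.publish({"type": "test_started", "id": 1})
log.publish({"type": "test_finished", "id": 1})

worker_batch = log.poll("worker")        # the actual worker service
analytics_batch = log.poll("analytics")  # sees the same events independently

log.rewind("worker")                     # replay from the beginning
replayed = log.poll("worker")
print(len(worker_batch), len(analytics_batch), len(replayed))  # 2 2 2
```

Because reading does not remove anything from the log, the analytics consumer is unaffected by what the worker has already read, and the worker can go back and process the same events again.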
Some cons of this approach:
- Time to develop a feature increases
- Readability of code is decreased
- Onboarding time of a new joiner on that service is increased
- Response time of the service might increase
Some other queue-based systems include
- Amazon SQS
There are many more ready-made solutions available, and you can also build your own on top of a database.
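As a rough idea of what a database-backed queue can look like, here is a minimal single-process sketch using Python's built-in `sqlite3` module. The table layout and the `enqueue`/`dequeue` helpers are assumptions made for this example. With several workers, claiming a job would also have to be made atomic, which this sketch does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'pending')"
)


def enqueue(payload):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def dequeue():
    """Take the oldest pending job, or return None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (row[0],))
        return row[1]


enqueue("resize-image-1")  # hypothetical job payloads
enqueue("resize-image-2")
first, second, third = dequeue(), dequeue(), dequeue()
print(first, second, third)  # resize-image-1 resize-image-2 None
```

Ordering by the auto-incrementing `id` gives the FIFO behaviour, and marking rows as done instead of deleting them keeps a history that can be inspected later.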
I hope this helps you make some architectural decisions.