Load Balancing And Rate Limiting for Dummies (Part 1)

Lakshay Bhambri
5 min read · Feb 16, 2020


As always, I would like to start my knowledge dumping with a short story. Last week my mentor threw a task at me: add a rate limiting strategy to my API. I knew about this thing from various indirect scenarios in my past but had never really dug deeper into the subject. Google was always my saviour :P . But the situation got tricky when he asked me not only to add the feature, but also to explain its end-to-end flow to him. That’s when I started reading about rate limiting in HAProxy. As I went deeper, I understood that the time had come to get myself clear on both load balancing and rate limiting. I spent some time understanding the basics and then went on to accomplish my task. Enough of my shit, let me now share my learnings with everyone.

The term, load balancing, itself opened a whole new can of worms.

For all the dummies out there, let’s get some clarity on the context first.
Gone are the days when the world was a small, peaceful place and fewer people were out there on the internet. Systems/servers did not need to care about heavy traffic on their websites. But today, everyone from serious scientists presenting their work in real time on a video conference to our parents, the completely analog generation, feels entitled to share their latest trip stories on Instagram or WhatsApp. So systems are working overtime to keep everyone satisfied. We have now reached a point where one server cannot handle the load of all these people’s requests.

Enter multiple instances/replicas of the same application to share the load among themselves.

Team Work — Old but still gold idea

But wait. Who’s to decide which instance gets which request??

Enter the great load balancers.

You might now intuitively guess what they are here for: to balance and wisely distribute the traffic among all these application instances. Here’s a more formal definition of a load balancer:

A load balancer is a piece of hardware (or virtual hardware, i.e. a software application) that acts as a reverse proxy to distribute network and/or application traffic across different servers.

For all those dummies out there, a reverse proxy is just a fancy term for a hardware/software component that sits on the server end and routes incoming traffic to the appropriate service. A forward proxy, on the other hand, sits on the client side and routes outgoing traffic towards the internet, often to bypass firewalls.
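To make that concrete, here is a minimal sketch of what a software load balancer’s configuration can look like, using HAProxy since that is where this series is headed. Everything in it (section names, server names, IP addresses) is made up for illustration, not taken from any real setup.

```
# Minimal HAProxy sketch: one public entry point, traffic spread over two app servers.
# All names and addresses below are placeholders.

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend fe_web
    bind *:80                      # clients talk to the load balancer, not to the servers
    default_backend be_app

backend be_app
    balance roundrobin             # hand requests to the servers in turn (more on algorithms later)
    server app1 10.0.0.11:8080 check   # 'check' enables basic health checking
    server app2 10.0.0.12:8080 check
```

Clients only ever see the load balancer’s address; it decides, per request, which app server actually does the work.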

Now, just like in old-school classes, the very next topic of discussion will be the types of load balancers present today.

At a crude infra/hardware level, there are basically two types of load balancers:

  • Hardware/Bare Metal Load Balancers
  • Software Load Balancers

Let me give you a brief idea about hardware load balancers and get them out of our way ASAP.
The requirement for load balancing came up way too early, back when our operating systems/kernels were not strong enough to handle such large-scale traffic loads. Hence, the industry came up with tailor-made hardware to meet the requirement of a dedicated load balancing machine. Today, our systems are strong enough to even cater to the requirements of large-scale virtualisation. So these kinds of machines are slowly being phased out of the infrastructure of various organisations.

End of story.

Talking about software load balancers, they are further categorised into 3 types:

  • Link Level Load Balancers (L1/L2/L3 Load Balancers)
  • Network Level Load Balancers (L4 Load Balancers)
  • Application Level Load Balancers (L7 Load Balancers)

Dummies, confused?? Let me help.

To understand the concept of these “levels”, we will have to take a refresher on the old-school OSI 7-layer network communication model (layers 1 through 7, from the physical wire at the bottom up to the application at the top).

To revise this old forgotten shit, you can visit this link.

Once you are clear with that shit, I think we are good to go. Time to explain each one of them separately.

Link level load balancers decide which network link our data packets should be sent over.

Network level load balancers decide which route/path our data packets should take to reach their destination.

Application level load balancers, the latest of the three, decide which server should cater to our request.

We are going to put our focus on the last two for now.

Network Level Load Balancing

This type of load balancing usually operates at the TCP/IP layer. Requests, irrespective of their content, are forwarded in some predefined manner to the destination service/application. We will talk about the various kinds of “predefined manner” later.

I know!!! But can’t help you dummies out there.

Talking strictly in terms of HAProxy, it supports both network level and application level load balancing.

The idea behind load balancing at this level originates from the fact that it’s not only user-facing web services that need load balancing. Several internal services that face a high volume of requests, like a MySQL DB instance, worker pools, etc., which are non-HTTP in nature, also require load balancing to handle all the incoming requests effectively.
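As a rough illustration (not a production config), here is what an L4, TCP-mode HAProxy snippet for such a non-HTTP service could look like. The backend name, server names and addresses are placeholders I made up.

```
# Hypothetical L4 load balancing for a non-HTTP service (two MySQL replicas).
# In 'mode tcp', HAProxy forwards the raw byte stream and never inspects the payload.
# (global/defaults sections with timeouts omitted for brevity.)

frontend fe_mysql
    bind *:3306
    mode tcp
    default_backend be_mysql

backend be_mysql
    mode tcp
    balance leastconn              # new connections go to the least-busy replica
    server db1 10.0.0.21:3306 check
    server db2 10.0.0.22:3306 check
```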

There’s one more interesting finding which I must mention here. Under this umbrella term, there exists another kind of load balancing which occurs at the DNS level. Although, theoretically speaking, DNS falls under the application layer, this kind of load balancing is generally grouped under this flag.

What happens in DNS load balancing is pretty interesting. Whenever a DNS server receives a request to resolve the IP address for a specific hostname, it returns the list of IP addresses for that hostname in a rotating, round robin order. Since clients typically use the first address they receive, this initial balancing of the traffic can save the use of balancing at the server level.
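In practice, round robin DNS often just means publishing several A records for the same name. A made-up, BIND-zone-style example (the addresses come from the documentation range, not a real service):

```
; Hypothetical zone entries: one hostname, three A records.
; Successive lookups see these addresses in a rotating order.
www.example.com.   300   IN   A   203.0.113.10
www.example.com.   300   IN   A   203.0.113.11
www.example.com.   300   IN   A   203.0.113.12
```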

DNS Load Balancing

Application Level Load Balancers

Ah, finally, talking about something devs can relate to. “Applications”

As per my understanding of the subject so far, load balancing at this level mainly involves HTTP requests. In layman’s terms, this is where you decide which request must be sent to which server based on its content: cookies, headers, TLS versions, etc. Put simply, content-based load balancing.
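To give a feel for it, here is a rough HAProxy-flavoured sketch of content-based routing; the hostnames, backend names and addresses are invented for illustration, and the real configs will come in the follow-up posts.

```
# Hypothetical L7 (mode http) routing: choose a backend based on request content.
# (global/defaults sections omitted; all names and addresses are made up.)

frontend fe_http
    bind *:80
    mode http
    acl is_static hdr(host) -i static.example.com   # match on a header's value
    use_backend be_static if is_static
    default_backend be_app

backend be_static
    mode http
    server s1 10.0.0.31:8080 check

backend be_app
    mode http
    server a1 10.0.0.11:8080 check
    server a2 10.0.0.12:8080 check
```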

This is the kind of load balancing we are going to dig deeper into, using HAProxy, in my upcoming posts.

Lastly, we need to discuss the types of those “predefined manners” of load balancing I mentioned above. Formally, we should call them load balancing algorithms. Loosely, there are 4 basic algorithms for load balancing (a small HAProxy-flavoured sketch of how they are expressed follows the list).

  • Least Connection Method — directs traffic to the server with the fewest active connections. Most useful when there are a large number of persistent connections in the traffic unevenly distributed between the servers.
  • Least Response Time Method — directs traffic to the server with the fewest active connections and the lowest average response time.
  • Round Robin Method — rotates servers by directing traffic to the first available server and then moves that server to the bottom of the queue. Most useful when servers are of equal specification and there are not many persistent connections.
  • IP Hash — the IP address of the client determines which server receives the request.
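Assuming HAProxy as the load balancer (and ignoring the many other algorithms it offers), the least connection, round robin and IP hash methods above map roughly onto the balance directive’s leastconn, roundrobin and source options; as far as I know there is no stock least-response-time option, so it is left out of this sketch. Server names and addresses are placeholders.

```
# Hypothetical backends showing how the basic algorithms map onto HAProxy's
# 'balance' keyword.

backend be_roundrobin
    mode http
    balance roundrobin     # rotate through the servers in turn
    server s1 10.0.0.11:8080 check
    server s2 10.0.0.12:8080 check

backend be_leastconn
    mode http
    balance leastconn      # prefer the server with the fewest active connections
    server s1 10.0.0.11:8080 check
    server s2 10.0.0.12:8080 check

backend be_iphash
    mode http
    balance source         # hash the client's IP so a given client sticks to one server
    server s1 10.0.0.11:8080 check
    server s2 10.0.0.12:8080 check
```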

In my next post, we are going to discuss load balancing in HAProxy in detail.
