Load Balancers: Traffic Police of the Internet

Sakthidharan Shridharan
Developer Community SASTRA
6 min read · May 23, 2021
Image source: Load Balancing | Google Cloud

Hello there! Let’s start with a story: you own a business, and its online platform is so amazing that it gets a lot of incoming and outgoing traffic all the time! This results in frequent server crashes. So, you increase the number of servers from one to three. Sounds great, doesn’t it?

Unfortunately, something weird happens. Your application crashes as frequently as before. You check what’s going on. It turns out that all (or most) of the incoming requests are handled by only one server, while the other two do no work at all. Sounds like a “group project” in school, right?

While having multiple servers is definitely a good thing, it turns out that there is a way to make sure that all these servers work (almost) equally.

Let’s welcome load balancers!

Image source: What Is Load Balancing? How Load Balancers Work (nginx.com)

Load balancing, in terms of networking, is the process of distribution of network or application traffic/requests across multiple servers.

A load balancer is a server that receives requests on behalf of backend servers and then distributes those incoming requests to any available server capable of fulfilling them.

In other words, a load balancer does what the title of this article says: it is the “traffic police” of the internet (though its jurisdiction is small, limited to your servers).

What would a load balancer do?

Usually, a load balancer does the following:

  • Health checks: A load balancer should only forward traffic to “healthy” backend servers. Health checks work by periodically pinging the servers to verify that they are functioning properly; the load balancer forwards incoming requests to a server only if it passes these checks.
  • Distribution of requests: Once the healthy servers are identified, the load balancer selects one of them and forwards the incoming request to it, using a suitable algorithm to pick the optimal server.
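
The health-check step above can be sketched in a few lines. This is a minimal illustration, not a production design: the `/health` endpoint, backend addresses, and timeout are all hypothetical placeholders.

```python
import urllib.request

# Hypothetical backend pool; the addresses are placeholders.
BACKENDS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"]

def health_check(backends, timeout=2.0):
    """Return the subset of backends that answer a (hypothetical) /health endpoint."""
    healthy = []
    for url in backends:
        try:
            with urllib.request.urlopen(url + "/health", timeout=timeout) as resp:
                if resp.status == 200:
                    healthy.append(url)
        except OSError:
            pass  # an unreachable or slow server is marked unhealthy
    return healthy
```

A real load balancer runs this loop in the background on a timer, so a server that recovers is automatically added back to the pool.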

They can also do much more, such as mitigating DDoS attacks or acting as a firewall. Their exact functionality, however, depends on whether they are hardware-based or software-based.

Load Balancer Types: Hardware and Software

A load balancer, like parts of a computer, can be categorized as hardware-based or software-based.

Hardware-based load balancers

A hardware load balancer is a hardware device with a specialized operating system that distributes web application traffic/requests across a cluster of servers (or server farms). It can also be a specialized router or switch which is deployed in between the servers and the client.

To ensure the best performance, a hardware load balancer distributes traffic according to customized rules or algorithms so that no single server is overwhelmed.

These hardware-based load balancers stand between the clients and the servers, redirecting requests to the servers. When clients visit the website, they reach the load balancer first, which then directs them to different servers based on several factors. Such hardware comes with pre-set configurations and requires expertise to modify.

Hardware load balancers are typically over-provisioned: in other words, their capacity is buffed up so they can handle occasional traffic peaks. However, if traffic exceeds that peak capacity, they cannot easily be scaled up to meet the demand.

Software-based load balancers

On the other hand, software load balancers are simply installed on general-purpose servers or virtual machines. They typically work at Layer 7 (the application layer) of the OSI model.

They provide a flexibility not found in hardware load balancers, easily adjusting to the needs of your business. They are also easily accessible and do not require specialized hardware.

Whether network traffic is low or high, software load balancers can simply auto-scale in real-time, eliminating over-provisioning costs and worries about unexpected traffic surges.

Hardware appliances are not compatible with cloud environments, whereas software load balancers run on bare metal, virtual machines, containers, and cloud platforms.

Usually, L4 load balancers are hardware-based, while L7 load balancers are software-based.

L4 and L7 - What do these numbers mean?

To answer the above question, we must treat the internet as a seven-layer sandwich. Yes, you heard it right. According to the Open Systems Interconnection (aka OSI) model, communication between two computers can be divided into multiple functions, which are grouped into 7 layers, as given in the illustration below:

Image source: What is the OSI Model? | Cloudflare

As we can see, Layer 4 is the transport layer, and Layer 7 is the application layer. An L4 load balancer acts on the transport layer protocols (such as TCP, UDP), while an L7 load balancer acts on the application layer protocols (such as HTTP).

L4 load balancers

An L4 load balancer works on Layer 4 (the transport layer) of the OSI model.

When a client makes a request, a TCP connection with the load balancer is created. The load balancer keeps this client connection open and relays its traffic over a connection to one of the upstream servers.

L4 load balancers do not inspect the contents of the packets; they forward them upstream or downstream using only minimal information (the source and destination addresses). Such a load balancer therefore relies on elementary algorithms for server selection.

L7 load balancers

An L7 load balancer, as the name suggests, works on Layer 7 (the application layer) of the OSI model.

When a client makes a request, a TCP connection with the load balancer is created. The load balancer then either redirects the client to one of the upstream servers or creates a new TCP connection with one of the upstream servers.

In the application layer, the data inside each packet is also visible, so the server can be selected by inspecting the packet contents. Hence, the load balancer can make more complex decisions, such as checking the headers for authorization, routing based on the URL, or routing based on the client.

As such, an L7 load balancer can afford a more complex algorithm to select the optimal server for each request.
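
URL-based routing, one of the L7 decisions mentioned above, can be sketched as a simple prefix lookup. The routing table and backend addresses below are made up for illustration.

```python
# Hypothetical routing table: URL path prefix -> backend pool.
ROUTES = {
    "/api/":    ["10.0.0.1:8080", "10.0.0.2:8080"],
    "/static/": ["10.0.1.1:8080"],
}
DEFAULT_POOL = ["10.0.2.1:8080"]

def route_by_url(path):
    """Pick a backend pool by inspecting the request path (an L7 decision).
    An L4 balancer could not do this, since the path lives inside the HTTP payload."""
    for prefix, pool in ROUTES.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```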

Which server to select? Let the algorithms do that!

So far, we have seen what a load balancer does. But how does it select the best server every time it receives a request? There are algorithms for exactly that job. Let us discuss some of the most widely used ones.

Some of the prominent load-balancing algorithms are given below:

  • Round Robin
  • IP Hashing
  • Least Connections
  • Least Resources Used (CPU/RAM)

Round Robin

This algorithm is one of the simplest approaches to load balancing. The first request is directed to the first server, and successive requests are sent to the successive servers. Once every server has received a request, the next request goes back to the first server and the cycle repeats.
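
The cycle described above maps directly onto a few lines of Python; the server names here are placeholders.

```python
import itertools

class RoundRobin:
    """Cycle through servers in order, wrapping back to the first."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)

rr = RoundRobin(["server-1", "server-2", "server-3"])
# Four requests in a row: the fourth wraps back to server-1.
picks = [rr.next_server() for _ in range(4)]
```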

Weighted Round Robin

The algorithm above assumes that all servers have equal capacity. If the capacities vary, it can be modified to give preference to servers with higher capacity.
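
One well-known variant is the “smooth” weighted round robin used by NGINX’s upstream module, which interleaves picks instead of sending a burst to the heaviest server. Here is a sketch; the server names and weights are illustrative.

```python
def smooth_wrr(weights, n):
    """Smooth weighted round robin: each turn, add every server's weight to its
    running score, pick the highest score, then subtract the total weight from
    the winner. `weights` maps server -> integer weight."""
    current = {s: 0 for s in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        for s, w in weights.items():
            current[s] += w
        winner = max(current, key=current.get)
        current[winner] -= total
        picks.append(winner)
    return picks
```

With weights a=5, b=1, c=1 over seven turns, server a gets five of the seven requests, but they are spread out rather than sent back to back.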

IP Hashing

In the IP hashing algorithm, the client’s IP address is used to select a server, so incoming requests from the same client are handled by the same server every time. The source IP address (and occasionally the destination IP address) is passed to a hash function, which performs a mathematical computation to determine the server.
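
A minimal sketch of that computation, using SHA-256 as the hash function (the choice of hash and the server names are illustrative):

```python
import hashlib

def pick_server_by_ip(client_ip, servers):
    """Hash the client's IP so the same client always lands on the same server
    (as long as the server list itself does not change)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note the caveat in the comment: adding or removing a server changes `len(servers)` and remaps most clients, which is why some systems use consistent hashing instead.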

Least Connections Method

In this algorithm, the load balancer maintains a count of the number of concurrent connections for each server. The server with the least number of connections is selected.
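The selection step reduces to a minimum over the connection counts; the counts below are made-up examples.

```python
def least_connections(active):
    """Pick the server with the fewest in-flight connections.
    `active` maps server name -> current connection count."""
    return min(active, key=active.get)
```

In practice the load balancer increments a server’s count when it forwards a request and decrements it when the connection closes.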

Least Resources Used

In this algorithm, a metric is collected from each server (such as memory usage or CPU usage), and the server that utilizes the fewest resources is chosen. This promotes more even resource usage across the servers.
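
A sketch of one way to combine such metrics; the 50/50 weighting of CPU and memory is an arbitrary illustrative choice, as are the numbers.

```python
def least_loaded(metrics):
    """Pick the server with the lowest combined utilization.
    `metrics` maps server -> (cpu_fraction, mem_fraction), each in [0, 1].
    The equal weighting here is an illustrative assumption, not a standard."""
    return min(metrics, key=lambda s: 0.5 * metrics[s][0] + 0.5 * metrics[s][1])
```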

There are many more algorithms, but in the end they all do the same thing: select a server for each request, much like the Sorting Hat in Hogwarts.

Load balancers are cool. But how do you get one?

A dedicated server can be configured using popular web server software, such as NGINX, to work as a load balancer.
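
As a taste of what that configuration might look like, here is a minimal NGINX sketch using the `upstream` module; the listen port and backend addresses are placeholders you would replace with your own.

```nginx
http {
    upstream backend {
        least_conn;                      # use the least-connections algorithm
        server 10.0.0.1:8080;
        server 10.0.0.2:8080 weight=2;   # this server gets twice the share
    }

    server {
        listen 80;
        location / {
            proxy_pass http://backend;   # forward each request to the pool
        }
    }
}
```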

For your cloud-based solutions, cloud service providers (such as Google Cloud Platform, Amazon Web Services) provide load balancing as a service.

Check out the link below for more information on GCP’s load balancing service:

Load Balancing | Google Cloud

And that concludes this small “chit chat” on load balancers.

