Load Balancing 101

Published in The KickStarter · 4 min read · Aug 24, 2020

Load balancers distribute traffic efficiently across multiple servers. This decouples the overall health of a backend service from the health of any single server, which helps your services stay online. In layman’s terms, a load balancer is a piece of hardware or software that works like a traffic cop for requests.

Load balancers are key to building distributed systems because they give applications the following benefits:

Redundancy: the service survives the failure of a single server; as long as at least one application server is left, the load balancer can still direct client traffic to the remaining working servers
Scalability: it is easier to add or remove servers based on the site’s traffic

Introducing load balancing can improve performance, reduce downtime, and help manage failures efficiently.
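To make the redundancy point concrete, here is a minimal sketch of a balancer that skips servers marked as down. The class name, the server names, and the `mark_down`/`mark_up` methods are all made up for illustration; a real load balancer would typically learn server health from periodic health checks rather than manual calls.

```python
class HealthAwareBalancer:
    """Rotates over servers, skipping any that are marked unhealthy (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._position = 0  # index of the next server to try

    def mark_down(self, server):
        # Assumption: something external (e.g. a health check) reports the failure
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting from the rotation point
        for _ in range(len(self.servers)):
            server = self.servers[self._position]
            self._position = (self._position + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = HealthAwareBalancer(["app1", "app2", "app3"])
lb.mark_down("app1")
lb.mark_down("app3")
print(lb.pick())  # traffic keeps flowing to the one remaining server
```

As long as one server stays in the healthy set, `pick` keeps returning it; only when every server is down does the sketch give up.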

Types of Load Balancers

Now that we know what load balancers are and why we need them, let’s discuss the types you might need to consider for your network. Load balancers can be broadly classified into the following:

1. Network Load Balancing

The Network Load Balancing (NLB) feature distributes traffic across several servers by using the TCP/IP networking protocol. By combining two or more computers that are running applications into a single virtual cluster, NLB provides reliability and performance for web servers and other mission-critical servers.

Network load balancing is considered the fastest type of load balancing. However, because it works at the transport layer and does not look at the content of requests, it tends to fall short when it comes to distributing traffic evenly across servers.

2. HTTP(S) Load Balancing

HTTP(S) load balancing is one of the oldest forms of load balancing. Unlike the network load balancer, it operates at the application layer (layer 7). It is often dubbed the most flexible type of load balancing because it lets you make distribution decisions based on any information that comes with an HTTP request, such as the URL, headers, or cookies.
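As a rough illustration of a layer 7 decision, the sketch below picks a pool of backend servers from the request path and one request header. The routing table, the pool names, and the `X-Beta-User` header are hypothetical, and the header lookup assumes exact-case keys, which real HTTP handling would not.

```python
from urllib.parse import urlsplit

# Hypothetical routing table: path prefix -> pool of backend servers
ROUTES = [
    ("/api/", ["api-1", "api-2"]),
    ("/static/", ["static-1"]),
]
BETA_POOL = ["beta-1"]
DEFAULT_POOL = ["web-1", "web-2"]


def choose_pool(url, headers=None):
    """Return the backend pool for a request, based on its header and path."""
    headers = headers or {}
    # Header-based rule (made-up header name, checked before the path rules)
    if headers.get("X-Beta-User") == "true":
        return BETA_POOL
    # Path-based rules: the first matching prefix wins
    path = urlsplit(url).path
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("https://example.com/api/users?id=1"))
print(choose_pool("https://example.com/about"))
```

A network (layer 4) balancer could not make this choice, since it never looks inside the HTTP request.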

3. Internal Load Balancing

Internal load balancing works much like network load balancing but is used to balance traffic within internal infrastructure. While external load balancers direct external traffic into the cluster, an internal load balancer is used for service discovery and load balancing within the cluster.

Load balancing algorithms

[Figure: Round Robin Load Balancing Algorithm]

>> Random

Random assignment is by far the least organized load balancing algorithm, and it does exactly what it says: it randomly assigns each request to a server. The underlying algorithm is just a random number generator, so it is simpler than it sounds. It generally works well in clusters where nodes have similar configurations of CPU, RAM, and so on.
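A minimal sketch of random assignment, assuming a plain list of made-up server names. The `rng` parameter is only there so the example can use a seeded generator and stay repeatable.

```python
import random
from collections import Counter


def pick_random(servers, rng=random):
    """Return one server chosen uniformly at random."""
    return rng.choice(servers)


servers = ["app1", "app2", "app3"]
rng = random.Random(42)  # seeded only to make the demo repeatable

# Over many requests, each server ends up with roughly a third of the traffic
counts = Counter(pick_random(servers, rng) for _ in range(3000))
print(counts)
```

Because every server is equally likely to be picked, this only balances well when the servers are roughly equal in capacity.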

>> Round Robin ( and Weighted Round Robin )

Round-robin load balancing is one of the simplest and most frequently used load balancing algorithms. Requests are distributed to application servers in rotation. Weighted Round Robin is a slightly improved algorithm where system administrators assign a weight to each server, based on its hardware and capacity, to reflect its traffic-handling capability.

Although the weighted algorithm distributes the workload more evenly by taking the computing and load-handling characteristics of the application servers into account, it still ignores how long each connection lasts and how much processing it demands. With varying connection lengths and traffic, some servers may accumulate most of the load, which leads to an unequal distribution.
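Both variants can be sketched in a few lines. The server names and weights below are made up, and the weighted version uses the most naive approach (repeating each server according to its weight), which sends bursts of requests to the same server; real balancers typically interleave the picks more smoothly.

```python
import itertools


def round_robin(servers):
    """Yield servers in rotation, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in rotation, each repeated according to its integer weight.

    `weights` maps a server name to a positive integer (hypothetical values).
    """
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


rr = round_robin(["app1", "app2", "app3"])
print([next(rr) for _ in range(5)])
# ['app1', 'app2', 'app3', 'app1', 'app2']

wrr = weighted_round_robin({"big": 3, "small": 1})
print([next(wrr) for _ in range(8)])
# ['big', 'big', 'big', 'small', 'big', 'big', 'big', 'small']
```

Note that neither function knows anything about how busy a server currently is, which is exactly the weakness described above.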

>> Least connections ( and Weighted Least Connections )

Least Connections is a dynamic load balancing algorithm where client requests are directed to the application server with the fewest active connections at the time the request is received. This algorithm is most useful when there are persistent connections. Similar to the Weighted Round Robin algorithm, the Weighted Least Connections algorithm adds a weight based on the capacity of each server. The algorithm then distributes the load based on both the weight of each server and the current number of active clients on it.
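Here is a small sketch of the idea, assuming the balancer itself tracks when connections open and close. The class and method names are hypothetical. Setting every weight to 1 gives plain Least Connections; with other weights, the sketch compares active connections divided by weight, which is one simple way to combine the two numbers.

```python
class LeastConnections:
    """Tracks active connections per server and picks the least loaded one."""

    def __init__(self, weights):
        # weights: server name -> positive number (use 1 everywhere for the unweighted case)
        self.weights = dict(weights)
        self.active = {server: 0 for server in self.weights}

    def acquire(self):
        # Pick the server with the lowest connections-to-weight ratio;
        # ties go to the server listed first
        server = min(self.active, key=lambda s: self.active[s] / self.weights[s])
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a client connection closes
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnections({"big": 2, "small": 1})
print([lb.acquire() for _ in range(3)])
# ['big', 'small', 'big']
```

Unlike round robin, the choice here depends on live state, so long-lived connections on one server push new requests toward the others.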

>> Source IP hash

The Source IP hash algorithm generates a hash key using the source and destination IP addresses of the client and server. This key determines which server receives the request. Because the same key can be regenerated after a connection is broken, a returning client is sent back to the same server, which is useful when its session is still active there. IP hashing can be incredibly useful, but there’s a catch: when a lot of traffic comes from the same IP addresses, a single server can become overloaded.
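A minimal sketch of the hashing step, assuming a fixed list of servers. The function name and the way the two addresses are joined into a key are illustrative choices. It uses `hashlib` rather than Python's built-in `hash()`, since the built-in string hash changes between runs.

```python
import hashlib


def pick_by_ip_hash(source_ip, dest_ip, servers):
    """Map a (source IP, destination IP) pair to one server, deterministically."""
    key = f"{source_ip}->{dest_ip}".encode()
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes of the digest into an index into the server list
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app1", "app2", "app3"]
first = pick_by_ip_hash("203.0.113.7", "198.51.100.10", servers)
again = pick_by_ip_hash("203.0.113.7", "198.51.100.10", servers)
print(first == again)  # the same client lands on the same server after reconnecting
```

One limitation of this simple modulo mapping is that adding or removing a server changes the result for many clients, so sessions can move when the pool changes.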

>> URL hashing

URL Hash is a load balancing algorithm that distributes writes uniformly across multiple sites and sends all reads to the site owning the object.
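One common reading of URL hashing is that the hash of the requested URL decides which server "owns" an object, so every request for that object goes to the same place. The sketch below shows only that ownership step. The function name is hypothetical, and hashing just the path (ignoring the query string) is an assumption of this example.

```python
import hashlib
from urllib.parse import urlsplit


def pick_by_url_hash(url, servers):
    """Return the server that owns the object at this URL's path."""
    path = urlsplit(url).path or "/"
    digest = hashlib.sha256(path.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["cache1", "cache2", "cache3"]
owner = pick_by_url_hash("https://example.com/images/logo.png", servers)
same = pick_by_url_hash("https://example.com/images/logo.png?v=2", servers)
print(owner == same)  # both requests go to the server that owns /images/logo.png
```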

*Other load balancing algorithms include Least Response Time and Least Bandwidth.

As with many things at scale, load balancing sounds simple on the surface but can get sophisticated and complex depending on the use case. Now it’s time to get our hands dirty by setting up load balancing with common load balancers like HAProxy and Nginx.

Load balance early and load balance often ⚖️

—

Some resources I found super useful while learning about this topic and writing this article: https://kemptechnologies.com/ca/, https://www.nginx.com/resources/glossary/load-balancing/, and https://www.dnsstuff.com/what-is-server-load-balancing