Fundamentals of Load Balancing (System Design Series | Part 1)

Danial Eskandari
5 min read · Feb 17, 2024


Load balancing in the context of distributed systems refers to the strategic distribution of incoming network traffic, data, or computational workload across multiple servers or resources within the system. The primary goal is to optimize resource utilization, prevent individual servers from becoming overloaded, and ensure efficient and reliable performance of the entire system. In this post, we will dive into the significance of load balancing in designing a more resilient and responsive infrastructure.

Definitions

Before delving into the actual concept of load balancing, let’s clarify some definitions to ensure that we are on the same page:

Distributed System:
A distributed system is an environment in which multiple interconnected computers work together as a single, unified system, spreading the workload across multiple nodes or machines. These nodes communicate and coordinate with each other to achieve a common goal.

Resource Partitioning:
Resource partitioning involves dividing the available resources, such as computing power, memory, and network bandwidth, among different components or servers within a distributed system.

Vertical Scaling:
Vertical scaling, also known as scaling up, involves increasing the capacity of individual servers by adding more resources, such as CPU, RAM, or storage, to meet growing demands.

Horizontal Scaling:
Horizontal scaling, or scaling out, involves adding more servers to a distributed system to handle increased workloads.

What is Load Balancing?

In a distributed system, various servers work collaboratively to handle user requests, process data, or perform computations. However, due to varying workloads, hardware capabilities, and potential failures, not all servers may contribute equally to the overall system’s performance. Load balancing addresses these challenges by intelligently allocating tasks or requests to different servers, thereby distributing the load evenly and preventing any single server from becoming a bottleneck.

When To Use Load Balancing?

When a resource-partitioned distributed system opts for horizontal scaling by adding more servers, a load balancer becomes essential for optimizing the performance and resource utilization of the entire infrastructure. Consider a resource server dedicated to serving specific data, executing computing processes, or handling specific functionality. A single server might struggle to respond to a large volume of user requests, creating the need for additional servers in a scale-out architecture. Once multiple resource servers are introduced to collectively absorb the increased workload, load balancing makes it possible to distribute that workload evenly across the nodes of the distributed system, ensuring uniform resource consumption across the infrastructure.

Why Do We Need Something Like This?

Load balancing is a crucial component in distributed systems, serving various essential purposes that contribute to the overall efficiency and robustness of the infrastructure.

Availability:
Load balancing ensures high availability by distributing incoming traffic across multiple servers. In the event of a server failure or maintenance, the load balancer redirects traffic to healthy servers, preventing downtime and ensuring continuous service availability.
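
As a rough illustration, here is a minimal sketch of active health checking in Go, assuming hypothetical backend addresses and a /healthz endpoint; a real load balancer would feed these results into its routing decisions rather than just printing them:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Hypothetical backend health endpoints.
	backends := []string{
		"http://10.0.0.1:8080/healthz",
		"http://10.0.0.2:8080/healthz",
	}
	client := &http.Client{Timeout: 2 * time.Second}
	for {
		for _, url := range backends {
			resp, err := client.Get(url)
			// A backend is considered healthy only if it answers 200 OK in time.
			healthy := err == nil && resp.StatusCode == http.StatusOK
			if resp != nil {
				resp.Body.Close()
			}
			fmt.Printf("%s healthy=%v\n", url, healthy)
		}
		time.Sleep(5 * time.Second) // probe interval
	}
}
```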

Scalability:
As a system experiences increased demand, load balancing becomes instrumental in achieving scalability. By distributing the workload evenly among servers, it allows the infrastructure to scale horizontally, adding more servers as needed. This ensures that the system can handle a growing number of users or transactions without compromising performance.

Reliability:
Load balancing enhances the reliability of a distributed system by preventing individual servers from being overloaded. It optimally allocates resources, mitigating the risk of bottlenecks and maintaining consistent performance across the entire infrastructure.

Redundancy:
Load balancing introduces redundancy by distributing traffic across multiple servers. In the case of server failures, the load balancer redirects traffic to other available servers, preventing a single point of failure. This redundancy enhances the overall reliability and resilience of the system.

Types of Load Balancers

Hardware Load Balancers:
Hardware load balancers are dedicated physical devices designed specifically for load balancing tasks. These devices operate at high speeds and are equipped with specialized hardware components optimized for efficiently handling network traffic. They often include features such as SSL termination, caching, and content compression. Hardware load balancers are ideal for high-performance scenarios where dedicated hardware resources are critical for achieving low latency and high throughput. However, they can be more expensive and less flexible compared to software counterparts.

Software Load Balancers:
Software load balancers, on the other hand, are applications or services that run on general-purpose servers. These load balancers leverage the server’s processing power and offer flexibility and scalability, allowing for easy integration with other software components and dynamic adjustments to changing workloads. While they may not match the raw performance of dedicated hardware, advancements in software load balancing technologies have closed the gap, making them increasingly popular for many use cases.

Layer 4 (Transport Layer) Load Balancing:
Layer 4 load balancing operates at the Transport Layer of the OSI model, focusing on network-level information such as IP addresses and port numbers. This type of load balancing is well-suited for distributing traffic based on factors like server availability, connection persistence, and load metrics. Layer 4 load balancers make routing decisions without inspecting the content of the actual data packets, providing a faster and more streamlined approach to load balancing. They are effective for scenarios where the application layer details are less critical, and efficient traffic distribution is the primary goal.
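
To make the distinction concrete, here is a minimal Go sketch of a Layer 4 proxy, assuming hypothetical backend addresses: it forwards raw TCP connections in cyclic order and never parses the bytes it relays:

```go
package main

import (
	"io"
	"log"
	"net"
	"sync/atomic"
)

// Hypothetical backend addresses.
var backends = []string{"10.0.0.1:8080", "10.0.0.2:8080"}
var next uint64

func main() {
	ln, err := net.Listen("tcp", ":9000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			continue
		}
		// Pick the next backend in cyclic order. The Layer 4 decision
		// involves only addresses and ports, never the payload.
		addr := backends[atomic.AddUint64(&next, 1)%uint64(len(backends))]
		go proxy(client, addr)
	}
}

func proxy(client net.Conn, addr string) {
	defer client.Close()
	backend, err := net.Dial("tcp", addr)
	if err != nil {
		return
	}
	defer backend.Close()
	// Relay bytes in both directions without inspecting them.
	go io.Copy(backend, client)
	io.Copy(client, backend)
}
```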

Layer 7 (Application Layer) Load Balancing:
Layer 7 load balancing occurs at the Application Layer of the OSI model, allowing load balancers to make routing decisions based on application-level data. This includes details such as HTTP headers, URL paths, and content characteristics. Layer 7 load balancers offer more control over traffic distribution, enabling intelligent routing based on the specific needs of applications. This makes them suitable for scenarios where content-based decisions, such as routing requests to different back-end servers based on the type of content requested, are crucial. While Layer 7 load balancing introduces additional processing overhead compared to Layer 4, the added application-awareness can lead to more sophisticated load balancing strategies.
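
As a rough sketch of what Layer 7 routing can look like, the following Go example uses the standard library’s httputil.ReverseProxy, with hypothetical backend URLs, to route requests to different pools based on the URL path:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func mustProxy(raw string) *httputil.ReverseProxy {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return httputil.NewSingleHostReverseProxy(u)
}

func main() {
	// Hypothetical backend pools.
	api := mustProxy("http://10.0.0.1:8080")    // API servers
	static := mustProxy("http://10.0.0.2:8080") // static-content servers

	mux := http.NewServeMux()
	// Routing decisions use application-level data: the URL path.
	mux.Handle("/api/", api)
	mux.Handle("/", static)
	log.Fatal(http.ListenAndServe(":9000", mux))
}
```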

Load Balancing Algorithms

Load balancing algorithms are essential components in distributed systems, responsible for distributing incoming network traffic or workload across multiple servers. Each algorithm has its own characteristics, strengths, and weaknesses. Here’s an overview of common load balancing algorithms (a short code sketch of two of them follows the list):

Round Robin Load Balancing: Requests are distributed to servers in a cyclic order.

Least Connections Load Balancing: Routes traffic to the server with the fewest active connections.

Weighted Least Connections Load Balancing: Like least connections, but weighs each server’s connection count against its assigned capacity, so more powerful servers receive proportionally more traffic.

Least Response Time Load Balancing: Routes traffic to the server with the lowest response time.

IP Hash Load Balancing: Uses a hash of the client’s IP address to determine which server handles the request, so the same client is consistently routed to the same server.

Random Load Balancing: Randomly selects a server from the pool to handle each request.
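
To ground a couple of these, here is a minimal Go sketch of round robin and least connections selection, assuming a simplified, hypothetical Server type that tracks its active connection count:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Server is a simplified backend that tracks its active connections.
type Server struct {
	Addr        string
	ActiveConns int64
}

// RoundRobin cycles through servers in a fixed order.
type RoundRobin struct {
	servers []*Server
	next    uint64
}

func (rr *RoundRobin) Pick() *Server {
	i := atomic.AddUint64(&rr.next, 1)
	return rr.servers[i%uint64(len(rr.servers))]
}

// LeastConnections picks the server with the fewest active connections.
func LeastConnections(servers []*Server) *Server {
	best := servers[0]
	for _, s := range servers[1:] {
		if atomic.LoadInt64(&s.ActiveConns) < atomic.LoadInt64(&best.ActiveConns) {
			best = s
		}
	}
	return best
}

func main() {
	pool := []*Server{{Addr: "10.0.0.1"}, {Addr: "10.0.0.2", ActiveConns: 3}}
	rr := &RoundRobin{servers: pool}
	fmt.Println("round robin:", rr.Pick().Addr, rr.Pick().Addr)
	fmt.Println("least connections:", LeastConnections(pool).Addr)
}
```

As a rule of thumb, round robin is simple and works well when servers are homogeneous and requests are similar in cost; least connections adapts better when request durations vary widely.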

Conclusion

In essence, load balancing strategically distributes workloads across servers to optimize performance. Choosing between hardware and software load balancers, and among the available algorithms, requires careful analysis of performance needs and cost.

Stay tuned for deeper insights into implementation details, best practices, and emerging trends in the upcoming parts of this series. If you found this information helpful, consider clapping, subscribing and supporting me through buymeacoffee.
