Published in Geek Culture

System Design Basics: Load balancers

An Introduction to Load Balancers for Beginners

Photo by Joshua Woroniecki on Unsplash

What is Load Balancing?

Load balancing refers to efficiently managing traffic across a set of servers, also known as server farms or server pools. Each load balancer sits between client devices and backend servers, receiving and then distributing incoming requests to any available server capable of fulfilling them.

What are Load Balancers?

A load balancer is a hardware device or software component that acts as a reverse proxy and distributes network or application traffic across several backend servers. It increases the number of concurrent requests a distributed system can handle and improves the system's availability and performance. It also reduces the burden on servers associated with managing and maintaining application and network sessions, and it can take over some application-specific tasks.

Load balancers are categorised into two groups: Layer 4 and Layer 7. Layer 4 load balancers work at the transport level, routing traffic based on IP addresses and TCP or UDP ports without inspecting the content of the packets. Layer 7 load balancers work at the application level, routing based on the content of each request, such as the URL, headers, or cookies of an HTTP request or API call.
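To make the difference concrete, here is a minimal Python sketch of what each kind of balancer can base its decision on. The pool addresses, class names, and the `pick` helper are made up for illustration, and real load balancers are considerably more involved.

```python
import zlib
from dataclasses import dataclass

# Hypothetical backend pools, used only for this illustration.
TCP_POOL = ["10.0.0.1", "10.0.0.2"]
API_POOL = ["10.0.1.1", "10.0.1.2"]
STATIC_POOL = ["10.0.2.1"]


@dataclass(frozen=True)
class Connection:
    """What a Layer 4 balancer sees: addresses and ports only."""
    src_ip: str
    src_port: int
    dst_port: int


@dataclass(frozen=True)
class HttpRequest:
    """What a Layer 7 balancer can also see: the parsed request."""
    method: str
    host: str
    path: str


def pick(pool, key):
    """Pick a backend deterministically from a pool using a CRC32 hash."""
    return pool[zlib.crc32(key.encode()) % len(pool)]


def route_layer4(conn):
    # Only the connection tuple is available; the payload is never inspected.
    return pick(TCP_POOL, f"{conn.src_ip}:{conn.src_port}->{conn.dst_port}")


def route_layer7(req):
    # The request content decides the pool: API calls go to the API servers,
    # everything else goes to the static content servers.
    pool = API_POOL if req.path.startswith("/api/") else STATIC_POOL
    return pick(pool, req.host + req.path)
```

With these assumptions, `route_layer7(HttpRequest("GET", "example.com", "/api/users"))` always lands on one of the API servers, while `route_layer4` can only spread connections across a single pool because it never looks at the path.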

A load balancer may be:

  • A physical device or a virtual instance running in a distributed system
  • Incorporated into application delivery controllers (ADCs), which are designed to more broadly improve the performance and security of applications, including microservice-based ones
  • A conglomeration of several load balancers, running on different algorithms based on the use case in a system.

Figure: Load balancer in a distributed system

Load Balancing Algorithms

Some of the algorithms used for load balancing are:

  • Round Robin: Requests are sent to the servers one after another in a fixed order, returning to the first server after the last one.
  • Least connections: Requests are sent to the backend server with the fewest active connections. In the weighted variant, the relative computing capacity of each server is also considered when deciding where the request should be sent.
  • IP Hashing: The client's IP address is hashed to select a backend server, so requests from the same client are consistently sent to the same server. This strategy is useful when a client should keep reaching the server that holds its session or cached data.
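The three strategies above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular load balancer implements them: the server names are placeholders, and CRC32 is just one convenient choice of hash function.

```python
import itertools
import zlib

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical backend names


class RoundRobin:
    """Hands out servers one after another, wrapping around at the end."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Tracks active connections and picks the least busy server."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a connection to the server is closed.
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    """Maps a client IP to a server; the same IP always gets the same server."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]
```

For example, four calls to `RoundRobin(SERVERS).pick()` return `app-1`, `app-2`, `app-3`, and then `app-1` again. Note that with simple modulo hashing, adding or removing a server remaps most clients, which is why production systems often use consistent hashing instead.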

Sticky Session

Session stickiness, or session persistence, is a mechanism by which a load balancer binds a client's session to a specific backend server. This ensures that all requests belonging to the same session are processed by the same server, so session data held on that server is not lost between requests.
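A minimal sketch of the idea, assuming the session is identified by some ID (for example a cookie value) that the balancer can read. The class name and server names are hypothetical, and new sessions are assigned round-robin purely for simplicity.

```python
import itertools


class StickyBalancer:
    """Pins each session ID to the server that handled its first request."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)
        self._session_to_server = {}

    def route(self, session_id):
        # First request of a session: assign the next server in rotation.
        if session_id not in self._session_to_server:
            self._session_to_server[session_id] = next(self._next_server)
        # Every later request of that session goes to the same server.
        return self._session_to_server[session_id]


balancer = StickyBalancer(["app-1", "app-2"])
first = balancer.route("session-alice")
assert balancer.route("session-alice") == first
```

The mapping table in this sketch lives in the balancer's memory. Real deployments commonly encode the chosen server in a cookie instead, so that the balancer itself stays stateless.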

Figure: How sticky sessions work

The advantage of sticky sessions is that servers within the distributed system don't need to share session data with each other, so each server can work independently. There is also the added advantage of better RAM cache utilisation: a session's data stays cached on the one server that handles it, which results in better responsiveness.

But this is not without its cons. A server may become overloaded with too many sessions, and session data may be lost if a client is shifted to another server mid-session. There is also the latency added by routing everything through one central load balancer.

Elastic Load Balancers

An Elastic Load Balancer (ELB) can scale load balancers and applications based on real-time traffic automatically. ELB automatically distributes incoming application traffic across multiple targets and virtual appliances in one or more Availability Zones (AZs).

It uses health checks to learn the status of application pool members (application servers), routes traffic to the available servers, manages failover to high-availability targets, and can automatically spin up additional capacity.

ELB scales your load balancer as traffic increases. The load balancer acts as a single point of contact for all incoming requests and, by monitoring the health of the instances, distributes load only among the healthy ones.
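The health-check behaviour described above can be illustrated with a small Python sketch. This is not how ELB is implemented. The `is_healthy` callback stands in for a real probe such as an HTTP request to a health endpoint, and all names are hypothetical.

```python
import itertools


class HealthAwareBalancer:
    """Round-robin balancer that skips servers failing their health check."""

    def __init__(self, servers, is_healthy):
        self.servers = list(servers)
        self.is_healthy = is_healthy  # callable: server name -> bool
        self.healthy = list(self.servers)
        self._counter = itertools.count()

    def run_health_checks(self):
        # In practice this would run periodically in the background.
        self.healthy = [s for s in self.servers if self.is_healthy(s)]

    def pick(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        return self.healthy[next(self._counter) % len(self.healthy)]


# Simulated probe results: app-2 is currently down.
status = {"app-1": True, "app-2": False, "app-3": True}
balancer = HealthAwareBalancer(list(status), status.get)
balancer.run_health_checks()
assert "app-2" not in {balancer.pick() for _ in range(4)}
```

Once `app-2` passes its check again, the next call to `run_health_checks()` puts it back into rotation, which mirrors the rerouting behaviour described in the text.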

AWS Elastic Load Balancer

Elastic Load Balancing automatically distributes incoming application traffic across multiple server instances. It enables you to achieve greater levels of fault tolerance in your applications, seamlessly providing the required amount of load balancing capacity needed to distribute application traffic.

Elastic Load Balancing detects unhealthy instances and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored. Customers can enable Elastic Load Balancing within a single Availability Zone or across multiple Availability Zones for more consistent application performance.

AWS offers three main types of Elastic Load Balancer:

  • Application Load Balancer: Application Load Balancer is best suited for load balancing at the application level (HTTP/HTTPS requests). It provides advanced request routing targeted at the delivery of modern application architectures, including microservices and containers.
  • Network Load Balancer: Network Load Balancer operates at the transport layer (Layer 4). It is best suited for load balancing of Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Transport Layer Security (TLS) traffic where extreme performance is required.
  • Gateway Load Balancer: Gateway Load Balancer makes it easy to deploy, scale, and run third-party virtual networking appliances. Providing load balancing and auto-scaling for fleets of third-party appliances, Gateway Load Balancer is transparent to the source and destination of the traffic. This capability makes it well suited for working with third-party appliances for security, network analytics, and other use cases.

Congratulations on making it to the end! Feel free to talk about technology or any cool projects on Twitter, GitHub, Medium, LinkedIn or Instagram.

Thanks for reading!
