Load balancer at your fingertips (Golang)

Khoa Pham
5 min read · Dec 5, 2018

A load balancer (LB), as its name suggests, is a gateway that distributes incoming traffic across backends. The load balancer in this post refers to an application-layer (layer 7) LB.

To compare with a layer 4 LB in layman’s terms: a layer 7 LB routes requests (diagram source: ntu.edu.sg), not network packets (diagram source: xerocrypt).

Goal:

Build an LB capable of balancing HTTP requests to multiple backends

Extra:

  • Auto-discover backend services using DNS SRV records (a common technique in the container world)
  • Distribute load according to SRV priority and weight
  • Pool TCP connections to backends for reuse
  • Encrypt public traffic with TLS (HTTPS)
  • For fun: benchmark it against other popular LBs (in Go)

Building:

An LB is technically a reverse proxy with a dynamic scheduler on top, so let’s build the reverse proxy first:

ln, _ := net.ListenTCP("tcp", addr)
for {
	conn, _ := ln.Accept()
	go proxy(conn)
}
...
func proxy(src net.Conn) {
	// set up the request reader
	for {
		// read the request
		dst, _ := net.DialTCP("tcp", nil, baddr)
		go io.Copy(dst, src) // client -> backend
		go io.Copy(src, dst) // backend -> client
	}
}

There is a lot more to it, but that is the skeleton of a proxy in Go.
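
For reference, a minimal self-contained version of that skeleton could look like this. It is only a sketch: the backend address is hard-coded to 127.0.0.1:9000 for illustration, and a real LB would pick the backend per request.

package main

import (
	"io"
	"log"
	"net"
)

func main() {
	// Listen for client connections on the LB port.
	ln, err := net.Listen("tcp", ":8090")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		go proxy(conn)
	}
}

// proxy pipes bytes between a client and one (hard-coded) backend.
func proxy(src net.Conn) {
	defer src.Close()
	dst, err := net.Dial("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Print(err)
		return
	}
	defer dst.Close()
	go func() {
		io.Copy(dst, src) // client -> backend
		dst.Close()       // unblock the other direction
	}()
	io.Copy(src, dst) // backend -> client
}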

Resolving:

GET /user/actionX HTTP/1.1
Host: localhost:8090

Using either the hostname or the URI, we must match the request with a backend. For dynamic resolving we will rely on SRV records (they carry target, port, priority and weight, which is handy for microservice orchestration).

// Matcher maps a request (URI, Host) to a backend service name
type Matcher func(uri, host []byte) string
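
A trivial Matcher might key off the Host header or a URI prefix alone. The service names and rules below are made up, just to show the shape:

import "bytes"

// hostMatcher maps a request (URI, Host) to a backend service name.
var hostMatcher Matcher = func(uri, host []byte) string {
	switch {
	case bytes.HasPrefix(host, []byte("api.")):
		return "api"
	case bytes.HasPrefix(uri, []byte("/static")):
		return "assets"
	default:
		return "web"
	}
}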

These DNS records can also change (a new container is deployed, an old one is shut down), so we put in a ticker and fan out the lookups (updates) on every tick.

func (s *Scheduler) relookupEvery(d time.Duration) {
	ticker := time.NewTicker(d)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			...
			for _, service := range services {
				go lookup(service)
			}
		}
	}
}

An SRV record example (priority 0, weight 5, port 5060, target sipserver.example.com):

_sip._tcp.example.com. 86400 IN SRV 0 5 5060 sipserver.example.com.
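
The lookup fanned out by the ticker could be little more than a call to net.LookupSRV from the standard library. A sketch, with placeholder service/domain names and the scheduler wiring left out:

// lookup refreshes the backends for one service via its SRV record,
// e.g. _api._tcp.example.com (names here are placeholders).
func lookup(service string) {
	_, srvs, err := net.LookupSRV(service, "tcp", "example.com")
	if err != nil {
		log.Print(err)
		return
	}
	for _, srv := range srvs {
		// srv carries Target, Port, Priority and Weight -
		// exactly what the scheduler needs.
		fmt.Printf("target=%s port=%d priority=%d weight=%d\n",
			srv.Target, srv.Port, srv.Priority, srv.Weight)
	}
}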

To route requests fairly, we need a scheduler. The most common technique is to assign requests to each backend in turn, like a round-robin tournament. To take it a bit further, our scheduler also needs to respect the DNS SRV priority and weight (to distribute different loads to different backends):

type (
	Scheduler struct {
		sync.Mutex
		backends map[string]*queue
	}
	backend struct {
		target   string
		priority uint
		weight   uint
	}
	queue []backend
)

func (q queue) Len() int           { ... }
func (q queue) Less(i, j int) bool { ... }
func (q queue) Swap(i, j int)      { ... }

func (q *queue) Push(x interface{}) {
	*q = append(*q, x.(backend))
}

func (q *queue) Pop() interface{} {
	old := *q
	n := len(old)
	x := old[n-1]
	*q = old[0 : n-1]
	return x
}

func (s *Scheduler) NextBackend(name string) string {
	...
	b = heap.Pop(q).(backend)
	return b.target
}

We use the standard library heap (a priority queue) for the scheduling logic: priority decides who is served first, weight decides how much load each backend gets. This is how it plays out:

+---------+----------+--------+
| backend | priority | weight |
+---------+----------+--------+
| b1 | 0 | 30 |
| b2 | 0 | 20 |
| b3 | 10 | 40 |
+---------+----------+--------+
// for every 9 requests
+----+----+----+----+----+----+----+----+----+
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
+----+----+----+----+----+----+----+----+----+
| b1 | b2 | b1 | b2 | b1 | b3 | b3 | b3 | b3 |
+----+----+----+----+----+----+----+----+----+
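
The Less/Push/Pop bodies are elided above, so here is one hypothetical way to fill them in that reproduces the table. It assumes each backend carries a per-cycle credit proportional to its weight; the actual unload logic may differ.

package main

import (
	"container/heap"
	"fmt"
)

type backend struct {
	target   string
	priority uint // lower value is scheduled earlier
	weight   uint // share of requests within one cycle
	credit   uint // assumed field: requests left in the current cycle
}

type queue []backend

func (q queue) Len() int      { return len(q) }
func (q queue) Swap(i, j int) { q[i], q[j] = q[j], q[i] }

// Less: backends with credit left come first; among those, lower priority
// wins, then the one with the largest remaining share of its weight
// (credit/weight, compared by cross-multiplication), then the heavier one.
func (q queue) Less(i, j int) bool {
	if (q[i].credit == 0) != (q[j].credit == 0) {
		return q[i].credit != 0
	}
	if q[i].priority != q[j].priority {
		return q[i].priority < q[j].priority
	}
	ci, cj := q[i].credit*q[j].weight, q[j].credit*q[i].weight
	if ci != cj {
		return ci > cj
	}
	return q[i].weight > q[j].weight
}

func (q *queue) Push(x interface{}) { *q = append(*q, x.(backend)) }

func (q *queue) Pop() interface{} {
	old := *q
	n := len(old)
	x := old[n-1]
	*q = old[:n-1]
	return x
}

// nextBackend pops the best candidate, spends one credit and pushes it back.
// When every backend is exhausted, a new cycle starts.
func nextBackend(q *queue) string {
	b := heap.Pop(q).(backend)
	if b.credit == 0 { // all backends exhausted: refill the cycle
		for i := range *q {
			(*q)[i].credit = (*q)[i].weight / 10 // weights are multiples of 10 here
		}
		b.credit = b.weight / 10
		heap.Push(q, b)
		heap.Init(q)
		return nextBackend(q)
	}
	b.credit--
	heap.Push(q, b)
	return b.target
}

func main() {
	q := &queue{
		{"b1", 0, 30, 3},
		{"b2", 0, 20, 2},
		{"b3", 10, 40, 4},
	}
	heap.Init(q)
	for i := 0; i < 9; i++ {
		fmt.Print(nextBackend(q), " ") // b1 b2 b1 b2 b1 b3 b3 b3 b3
	}
	fmt.Println()
}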

TCP connections are not free, especially for an LB. Proxying a connection means maintaining at least two sockets: one to the client, one to the backend. Opening and closing those sockets under load is resource intensive, so keeping them alive is the smarter choice. Let’s pool those connections to the backends:

type Proxy struct {
	sync.Mutex
	conns map[string]map[*tcpConn]struct{}
}

func (p *Proxy) get(saddr string) *tcpConn {
	p.Lock()
	defer p.Unlock()
	if pool, ok := p.conns[saddr]; ok {
		for c := range pool {
			if c.busy {
				continue
			}
			c.busy = true // claim it while we still hold the lock
			return c
		}
	}
	return nil
}

func (p *Proxy) open(addr *net.TCPAddr) *tcpConn {
	saddr := addr.String()
	c := p.get(saddr)
	if c != nil {
		return c
	}
	...
}
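
The other half of the pool is handing connections back. A hypothetical counterpart to get/open, assuming the same busy flag on tcpConn:

// put registers a freshly opened connection in the pool.
func (p *Proxy) put(saddr string, c *tcpConn) {
	p.Lock()
	if p.conns[saddr] == nil {
		p.conns[saddr] = make(map[*tcpConn]struct{})
	}
	p.conns[saddr][c] = struct{}{}
	p.Unlock()
}

// release marks a pooled connection as reusable once a request is done.
func (p *Proxy) release(c *tcpConn) {
	p.Lock()
	c.busy = false
	p.Unlock()
}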

Traffic encryption sounds like a lot of work, but with Go’s standard crypto/tls package we can make our LB speak TLS almost immediately. I would call this a switch (3 SLOC ❤️):

...
crt, _ := tls.X509KeyPair(cert, key)
cfg := &tls.Config{Certificates: []tls.Certificate{crt}}
l, _ := net.ListenTCP("tcp", addr)
ln := tls.NewListener(l, cfg)
for {
	conn, e := ln.Accept()
	...
}

Benchmark:

Let’s see how it performs against a very popular Go LB: traefik

// unload:
Running 1m test @ http://unload.local:8090/bench
20 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 183.18ms 99.59ms 826.08ms 71.50%
Req/Sec 273.77 101.69 1.20k 79.53%
Latency Distribution
50% 223.54ms
75% 241.33ms
90% 267.15ms
99% 325.42ms
325072 requests in 1.00m, 39.06MB read
Socket errors: connect 0, read 53, write 0, timeout 0
Non-2xx or 3xx responses: 19
Requests/sec: 5409.40
Transfer/sec: 665.60KB
// traefik:
Running 1m test @ http://test.traefik:8000/bench
20 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 181.26ms 147.99ms 809.41ms 50.72%
Req/Sec 291.67 89.69 0.97k 72.41%
Latency Distribution
50% 199.24ms
75% 291.60ms
90% 370.37ms
99% 538.59ms
345813 requests in 1.00m, 33.64MB read
Requests/sec: 5754.70
Transfer/sec: 573.22KB

The numbers are not bad; refinements are needed, but we are not too far off…

GO-tchas:

io.Copy on a net.Conn blocks, and so does your goroutine. Even when src is closed, the goroutine just hangs there indefinitely:

go io.Copy(src, dst) // blocked until dst is signaled done

SetReadDeadline to the rescue - indirectly though:

errc := make(chan error, 1)
go cp(dst, src, errc)
go cp(src, dst, errc)
<-errc
dst.SetReadDeadline(time.Now()) // to break the second routine
...
func cp(dst io.Writer, src io.Reader, result chan error) {
	_, err := io.Copy(dst, src)
	result <- err
}
...
// before reusing dst:
dst.SetReadDeadline(time.Time{})

Working with IO streams in Go is fast and practical, even for something as involved as an LB. The part I enjoy most is that the behavior of any net.Conn type can be extended through composition. All we need is an embedded net.Conn field (covering TCP, UDP and TLS connections alike). Neat!
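
A sketch of that composition, with an illustrative wrapper around net.Conn (field names here are assumptions, not unload’s exact type):

import (
	"net"
	"time"
)

// tcpConn decorates any net.Conn (TCP or TLS) with pooling state.
type tcpConn struct {
	net.Conn           // embedded: Read, Write, Close, deadlines come for free
	busy     bool      // claimed by an in-flight request
	lastUsed time.Time // hypothetical: could drive idle eviction
}

// Override only what we need; everything else is promoted from net.Conn.
func (c *tcpConn) Close() error {
	c.busy = false
	return c.Conn.Close()
}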

The full source code is here: https://github.com/owlwalks/unload

Let me know what you think in the comments section below.

Happy coding Gophers!
