Building Scalable Go Microservices

Scale stateless and stateful services with HAProxy

Dmytro Misik
20 min read · Sep 21, 2023

One of the most critical non-functional requirements for web applications is scalability: the ability to handle an increased load without a decrease in performance or availability. The primary goal of a business is to grow and earn more money, so the applications behind it need to grow with it, and that sometimes takes a lot of work. In this post, I will describe how to scale different kinds of web applications.

Scaling Types

There are two scaling types:

  • Vertical scaling (or “scaling up”) — adding more resources (CPU, memory, bandwidth, or storage) to the existing servers;
  • Horizontal scaling (or “scaling out”) — adding more servers to distribute the traffic.

Vertical Scaling

It’s the easiest way to scale your application: you add more compute and/or storage resources based on the needs. The application logic doesn’t care about machine resources, so nothing special is needed to scale up, and every application can do it. But this type of scaling is not enough on its own. There are several reasons why:

  • Machine resources are limited, so you won’t be able to scale up endlessly;
  • Powerful machines cost a lot of money;
  • If your application is deployed to a single machine and that machine goes down, your application becomes unavailable;
  • It’s hard to deploy the application to a single machine without downtime;
  • Machine maintenance is impossible without application downtime.

You also can’t call your application “scalable” if it can only scale vertically. As you can see, an application deployed to a single machine cannot be highly available. From the very beginning, you need to be able to host your application on at least two machines, and for that it must support horizontal scaling.

Horizontal Scaling

The ability of web applications to scale out is crucial, but it comes with more complexity. First, if your application is hosted on several machines, traffic has to be balanced between them. Balancing can be done on the client side or on the server side.

Client-side traffic balancing:

  • Domain Name System (DNS) — you can register multiple IP addresses for a single domain name and balance traffic between these IPs using different algorithms (round-robin is the most popular);
  • Client-side — the client knows the server addresses and balances traffic between them itself. This usually involves a library developed by the owners of the server application, so clients don’t have to implement the balancing logic on their own (see the sketch below).
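To make the client-side approach concrete, here is a minimal round-robin client sketch in Go (the server list and the /health path are hypothetical, purely for illustration):

package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// rrClient spreads requests across a fixed list of servers in round-robin order.
type rrClient struct {
	servers []string // hypothetical base URLs of the server instances
	next    uint64
}

func (c *rrClient) get(path string) (*http.Response, error) {
	// Pick the next server atomically, so the client is safe for concurrent use.
	i := atomic.AddUint64(&c.next, 1) % uint64(len(c.servers))
	return http.Get(c.servers[i] + path)
}

func main() {
	c := &rrClient{servers: []string{"http://app1:4000", "http://app2:4000", "http://app3:4000"}}
	resp, err := c.get("/health")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}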

Server-side traffic balancing:

  • Load balancer — a server that distributes incoming traffic across multiple servers. There are hardware-based load balancers, which use a dedicated physical device to spread incoming traffic, and software-based load balancers, which do the same in software;
  • Content Delivery Network (CDN) — a network of servers distributed worldwide designed to deliver content to users from the server closest to them. CDN can distribute traffic between multiple servers and cache content to improve user experience and performance.

Which option to choose depends on the application you’re developing and your balancing strategy. The most popular technique for microservices is a load balancer, because the client then interacts with the cluster as if it were a single machine behind a single domain name, and you can scale the target application up and down without any changes on the client side.

Hardware-based load balancers require changes to the network at the physical level, which is not possible when you host your application in the cloud. So the popular option is a software-based load balancer. The most popular software-based load balancers are:

  • HAProxy — free, open-source, reliable, high-performance TCP/HTTP load balancer;
  • NGINX — a web server and reverse proxy that can be used as a load balancer;
  • Apache HTTP Server — a web server that can be used as a load balancer.

Now, let’s dive deep into how to scale different applications. But first, let’s start with the two types of web applications.

Web Application State

In terms of state, web applications can be stateless or stateful.

A stateless web application is an application that does not store data on the server between requests. Usually, the data is stored in an external database, or each request contains all the information necessary to process it.

A stateful web application is an application that stores data on the server itself, either on the machine’s storage or cached in memory.

Different development approaches exist for building scalable stateless and scalable stateful web applications. Let’s build a simple key-value store that provides an HTTP API for CRUD operations. First, we’ll use an external database to store the data (a stateless application), and then we’ll build the database inside the application itself (a stateful application).

Scalable Stateless Web Application

You can use many databases to build your web applications: relational, NoSQL, in-memory, columnar, time-series databases, etc. They all have their pros and cons for different scenarios. I’ll use one of the most popular in-memory key-value databases, Redis, to build the key-value store.

Redis supports three operations that are enough for this application (a quick redis-cli session follows the list):

  • GET key — get the value of key;
  • SET key value — set key to hold the string value;
  • DEL key — remove the specified key.
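For reference, here is how those commands behave in a plain redis-cli session (the key and value are just examples):

127.0.0.1:6379> SET hello world
OK
127.0.0.1:6379> GET hello
"world"
127.0.0.1:6379> DEL hello
(integer) 1
127.0.0.1:6379> GET hello
(nil)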

I will host the web server using the popular Go web framework Fiber. Let’s write the application code:

package main

import (
	"context"
	"os"

	"github.com/gofiber/fiber/v2"
	"github.com/redis/go-redis/v9"
)

type KeyValue struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

func main() {
	app := fiber.New()

	addr := os.Getenv("REDIS_ADDR")
	if addr == "" {
		addr = "localhost:6379"
	}

	rdb := redis.NewClient(&redis.Options{
		Addr: addr,
	})

	_, err := rdb.Ping(context.Background()).Result()
	if err != nil {
		panic(err)
	}

	app.Get("/health", func(c *fiber.Ctx) error {
		return c.SendStatus(fiber.StatusOK)
	})

	app.Get("/key/:key", func(c *fiber.Ctx) error {
		key := c.Params("key")
		val, err := rdb.Get(c.Context(), key).Result()
		if err != nil {
			return c.Status(fiber.StatusNotFound).JSON(fiber.Map{
				"error": "Key not found",
			})
		}
		return c.JSON(KeyValue{Key: key, Value: val})
	})

	app.Post("/key", func(c *fiber.Ctx) error {
		var kv KeyValue
		if err := c.BodyParser(&kv); err != nil {
			return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{
				"error": "Invalid request body",
			})
		}
		if err := rdb.Set(c.Context(), kv.Key, kv.Value, 0).Err(); err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to set key-value pair",
			})
		}
		return c.JSON(kv)
	})

	app.Put("/key/:key", func(c *fiber.Ctx) error {
		key := c.Params("key")
		var kv KeyValue
		if err := c.BodyParser(&kv); err != nil {
			return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{
				"error": "Invalid request body",
			})
		}
		if err := rdb.Set(c.Context(), key, kv.Value, 0).Err(); err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to update key-value pair",
			})
		}
		return c.JSON(KeyValue{Key: key, Value: kv.Value})
	})

	app.Delete("/key/:key", func(c *fiber.Ctx) error {
		key := c.Params("key")
		err := rdb.Del(c.Context(), key).Err()
		if err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to delete key-value pair",
			})
		}
		return c.SendStatus(fiber.StatusNoContent)
	})

	if err := app.Listen(":4000"); err != nil {
		panic(err)
	}
}

The application exposes the following API:

  • GET /health — health check;
  • GET /key/:key — returns key-value pair if key exists in Redis. Otherwise — Not Found;
  • POST /key — saves key-value pair to Redis;
  • PUT /key/:key — updates the key-value pair if it exists; otherwise, saves it. The logic is the same as in the previous method;
  • DELETE /key/:key — deletes key-value pair from Redis.

What happens if you start several copies of your application (I will call this a “cluster”)? All of them will use the same Redis server (which might be a cluster as well) to store and retrieve data. The only remaining question is how to balance traffic between the cluster nodes. HAProxy can solve this.
Let’s start several copies of our application with Docker Compose. But first, let’s define a Dockerfile for the Go application:

FROM golang:1.21

WORKDIR /app

COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN go build -o /out/app main.go

WORKDIR /out
CMD ["./app"]
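An optional variant (a sketch, not required for the rest of the post): a multi-stage build keeps the runtime image much smaller, assuming the application doesn’t need cgo:

FROM golang:1.21 AS build

WORKDIR /app

COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 go build -o /out/app main.go

FROM alpine:3.18
COPY --from=build /out/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]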

Now, let’s define a Compose file with three instances of the Go web server:

version: '3.9'

services:
  app1:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      REDIS_ADDR: redis:6379
    expose:
      - '4000'
    depends_on:
      - redis

  app2:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      REDIS_ADDR: redis:6379
    expose:
      - '4000'
    depends_on:
      - redis

  app3:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      REDIS_ADDR: redis:6379
    expose:
      - '4000'
    depends_on:
      - redis

  redis:
    image: redis:7
    volumes:
      - redis-data:/data
    expose:
      - '6379'

volumes:
  redis-data:

Now let’s balance traffic across the cluster. To do that, let’s define the HAProxy configuration:

defaults
    mode http
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http-in
    bind *:80
    default_backend app

backend app
    balance roundrobin
    option httpchk GET /health
    server app1 app1:4000 check
    server app2 app2:4000 check
    server app3 app3:4000 check
This configuration defines three sections:

  • defaults — default settings shared by the other sections;
  • frontend — defines how incoming requests are handled;
  • backend — defines how requests are forwarded to the servers (an optional stats section is sketched below).
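Optionally (an extra sketch, not used in the rest of the post), HAProxy can also expose a built-in statistics page where you can watch backend servers go up and down; you would also need to publish the extra port in the Compose file:

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s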

So, HAProxy will listen on HTTP port 80 and balance traffic between the cluster nodes. It will also perform periodic health checks, and if a server does not respond to them successfully, it will be removed from the pool, so unhealthy or unavailable instances never receive traffic. Now let’s update the Compose definition:

version: '3.9'

services:
  app1:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      REDIS_ADDR: redis:6379
    expose:
      - '4000'
    depends_on:
      - redis

  app2:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      REDIS_ADDR: redis:6379
    expose:
      - '4000'
    depends_on:
      - redis

  app3:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      REDIS_ADDR: redis:6379
    expose:
      - '4000'
    depends_on:
      - redis

  haproxy:
    image: haproxy
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - '8081:80'
    depends_on:
      - app1
      - app2
      - app3

  redis:
    image: redis:7
    volumes:
      - redis-data:/data
    expose:
      - '6379'

volumes:
  redis-data:

To start everything, execute the following command:

docker compose up -d
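You can check that all five containers (three application instances, Redis, and HAProxy) are running:

docker compose ps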

Now, if you execute the following request, it will go through HAProxy to one of the application instances:

curl --location 'localhost:8081/key' \
--header 'Content-Type: application/json' \
--data '{
"key": "hello",
"value": "world"
}'
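You can then exercise the rest of the API the same way; each request may land on a different instance, but the result is the same because the state lives in Redis:

curl --location 'localhost:8081/key/hello'

curl --location --request PUT 'localhost:8081/key/hello' \
--header 'Content-Type: application/json' \
--data '{
"value": "there"
}'

curl --location --request DELETE 'localhost:8081/key/hello'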

Now try to stop app1 with the following command:

docker compose stop app1

A few seconds later, check the HAProxy logs with the following command:

docker compose logs haproxy

You will see something like this:

go-stateless-service-haproxy-1  | [WARNING]  (8) : Server app/app1 is DOWN, reason: Layer4 timeout, check duration: 2003ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

HAProxy detected the unavailable server with the health check and removed it from the pool. Your requests will never go to app1 through HAProxy while it is down. Now start app1 again:

docker compose start app1

A few seconds later, you will see the following in the HAProxy logs:

go-stateless-service-haproxy-1  | [WARNING]  (8) : Server app/app1 is UP, reason: Layer7 check passed, code: 200, check duration: 0ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.

It means HAProxy detected that app1 is healthy again and returned it to the pool.

The state is stored in Redis. As long as all the state lives in the external database, you have no issues with scaling the application horizontally.

Let’s see how this changes for a stateful application.

Scalable Stateful Web Application

Because the state is stored on the machine itself (on disk or in memory), scaling a stateful application requires replicating this state across all the machines. And there are a lot of questions to answer:

  • What algorithms are used to replicate the state across the cluster?
  • If a user saves some data to machine 1 and immediately performs a GET request against machine 2, what response will they see?
  • How would the conflict be resolved if user 1 saved the pair key-value1 to machine 1 and user 2 saved the pair key-value2 to machine 2 at the same time?
  • and so on…

So let’s start from the beginning.

Several cluster configurations are used to coordinate the distributed system:

  • Single-Leader — A single node in a cluster acts as a leader node and accepts all write requests. Requests are replicated across the cluster nodes. This configuration is used in most relational databases because it ensures consistency, but the leader can become a bottleneck if it becomes overloaded;
  • Multi-Leader — several nodes in the cluster can act as a leader. Each leader is responsible for a subset of the data. Writes are sent to the appropriate leader and then replicated across the cluster. This configuration improves availability and performance, but conflicts are possible;
  • Leaderless — there is no designated leader and all nodes are equal. This also improves availability and performance, but conflicts are possible.

Consensus algorithms ensure all nodes in a distributed system agree on a single state. The most popular consensus algorithm is Raft. If you want the details of how it works, I recommend reading the original Raft paper.

The process of finding servers in a cluster is called service discovery, and it can be static or dynamic. Static configuration is provided when the service starts, via a configuration file or environment variables. Dynamic discovery involves a service registry that lets services find each other automatically. The most popular service registries are Consul, etcd and ZooKeeper.

Next, we need to talk about the CAP theorem. The CAP theorem (Consistency, Availability, Partition tolerance) states that a distributed system cannot provide all three guarantees at once: when a network partition happens, you have to choose between consistency and availability. So you get either a CP system (consistent and partition tolerant, sacrificing availability) or an AP system (available and partition tolerant, but only eventually consistent). In many systems it’s better to be highly available than strictly consistent, so AP is preferable to CP, but not always: for financial systems, consistency is the main requirement.

So, let’s implement the same key-value store, but this time without Redis. Inside the application, we’ll implement the Raft protocol using the HashiCorp library github.com/hashicorp/raft.

The web application is going to be eventually consistent with a single leader, so it’s possible to read an outdated value right after a save.

To implement Raft, you need to implement a Finite State Machine (FSM):

package fsm

import (
	"encoding/json"
	"go-stateful-service/internal/model"
	"io"
	"log"
	"sync"

	"github.com/hashicorp/raft"
)

type FSM struct {
	mu sync.RWMutex
	kv map[string]string
}

type InternalFSM interface {
	raft.FSM
	Read(string) (string, bool)
}

var _ InternalFSM = (*FSM)(nil)

func NewFSM() InternalFSM {
	return &FSM{kv: make(map[string]string)}
}

func (f *FSM) Restore(snapshot io.ReadCloser) error {
	data, err := io.ReadAll(snapshot)
	if err != nil {
		return err
	}
	var snapshotData map[string]string
	if err := json.Unmarshal(data, &snapshotData); err != nil {
		return err
	}

	f.mu.Lock()
	defer f.mu.Unlock()
	f.kv = make(map[string]string)
	for k, v := range snapshotData {
		f.kv[k] = v
	}

	return nil
}

func (f *FSM) Snapshot() (raft.FSMSnapshot, error) {
	f.mu.RLock()
	defer f.mu.RUnlock()

	snapshot := make(map[string]string)
	for k, v := range f.kv {
		snapshot[k] = v
	}

	return &FSMSnapshot{snapshot: snapshot}, nil
}

func (f *FSM) Apply(l *raft.Log) interface{} {
	if l.Type == raft.LogCommand {
		var kv model.KeyValue
		if err := json.Unmarshal(l.Data, &kv); err != nil {
			log.Printf("Failed to unmarshal log data: %v", err)
			return nil
		}

		f.mu.Lock()
		defer f.mu.Unlock()
		if kv.Value == nil {
			delete(f.kv, kv.Key)
		} else {
			f.kv[kv.Key] = *kv.Value
		}
	}

	return nil
}

func (f *FSM) Read(key string) (string, bool) {
	f.mu.RLock()
	defer f.mu.RUnlock()

	val, ok := f.kv[key]
	return val, ok
}

A Raft FSM should implement three methods:

  • Apply — called once a log entry is committed by a majority of the cluster;
  • Snapshot — returns an FSMSnapshot used to support log compaction, to restore the FSM to a previous state, or to bring out-of-date followers up to a recent log index (a sketch of the FSMSnapshot type follows the list);
  • Restore — used to restore an FSM from a snapshot. It is not called concurrently with any other command. The FSM must discard all previous state before restoring the snapshot.
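The Snapshot method above returns an FSMSnapshot whose type is not shown in the listing. A minimal sketch of what it might look like (the struct name and field follow the FSM code above; Persist and Release are the two methods required by the raft.FSMSnapshot interface):

package fsm

import (
	"encoding/json"

	"github.com/hashicorp/raft"
)

// FSMSnapshot is a point-in-time copy of the key-value map.
type FSMSnapshot struct {
	snapshot map[string]string
}

// Persist writes the snapshot as JSON to the sink provided by Raft,
// which is the format Restore expects to read back.
func (s *FSMSnapshot) Persist(sink raft.SnapshotSink) error {
	data, err := json.Marshal(s.snapshot)
	if err != nil {
		sink.Cancel()
		return err
	}
	if _, err := sink.Write(data); err != nil {
		sink.Cancel()
		return err
	}
	return sink.Close()
}

// Release is called when Raft is finished with the snapshot; nothing to clean up here.
func (s *FSMSnapshot) Release() {}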

Now let’s update the main function:

package main

import (
	"encoding/json"
	"go-stateful-service/internal/fsm"
	"go-stateful-service/internal/model"
	"log"
	"net"
	"os"
	"path"
	"strings"
	"time"

	"github.com/gofiber/fiber/v2"
	"github.com/hashicorp/raft"
	raftboltdb "github.com/hashicorp/raft-boltdb"
)

func main() {
	raftPath := os.Getenv("RAFT_PATH")
	if _, err := os.Stat(raftPath); os.IsNotExist(err) {
		if os.Mkdir(raftPath, 0755) != nil {
			log.Fatal("failed to create Raft path")
		}
	}

	raftAddr := os.Getenv("RAFT_ADDR")
	raftID := os.Getenv("RAFT_ID")
	nodeAddrsStr := os.Getenv("RAFT_NODE_ADDRS")
	nodeAddrs := strings.Split(nodeAddrsStr, ",")
	if len(nodeAddrs) == 0 {
		log.Fatal("no Raft node addresses specified")
	}

	serverAddr := os.Getenv("SERVER_ADDR")

	config := raft.DefaultConfig()
	config.LocalID = raft.ServerID(raftID)
	config.ElectionTimeout = 2 * time.Second
	config.HeartbeatTimeout = 500 * time.Millisecond

	logStore, err := raftboltdb.NewBoltStore(path.Join(raftPath, "raft-log"))
	if err != nil {
		log.Fatal(err)
	}
	stableStore, err := raftboltdb.NewBoltStore(path.Join(raftPath, "raft-stable"))
	if err != nil {
		log.Fatal(err)
	}

	addr, err := net.ResolveTCPAddr("tcp", raftAddr)
	if err != nil {
		log.Fatal(err)
	}
	transport, err := raft.NewTCPTransport(addr.String(), addr, 3, 10*time.Second, os.Stderr)
	if err != nil {
		log.Fatal(err)
	}

	fsmStore := fsm.NewFSM()
	snapshotStore, err := raft.NewFileSnapshotStore(path.Join(raftPath, "raft-snapshots"), 1, os.Stderr)
	if err != nil {
		log.Fatal(err)
	}
	raftNode, err := raft.NewRaft(config, fsmStore, logStore, stableStore, snapshotStore, transport)
	if err != nil {
		log.Fatal(err)
	}

	var servers []raft.Server
	for _, nodeAddr := range nodeAddrs {
		s := strings.Split(nodeAddr, "@")
		if len(s) != 2 {
			log.Fatalf("invalid Raft node address: %s", nodeAddr)
		}
		servers = append(servers, raft.Server{
			ID:      raft.ServerID(s[0]),
			Address: raft.ServerAddress(s[1]),
		})
	}

	f := raftNode.BootstrapCluster(raft.Configuration{
		Servers: servers,
	})
	if err := f.Error(); err != nil {
		log.Println(err)
	}

	app := fiber.New()

	app.Get("/health", func(c *fiber.Ctx) error {
		return c.SendStatus(fiber.StatusOK)
	})

	app.Get("/leader", func(c *fiber.Ctx) error {
		state := raftNode.State()
		if state != raft.Leader {
			return c.Status(fiber.StatusServiceUnavailable).JSON(fiber.Map{
				"error": "Not the leader",
			})
		}

		return c.SendStatus(fiber.StatusOK)
	})

	app.Get("/key/:key", func(c *fiber.Ctx) error {
		key := c.Params("key")
		val, ok := fsmStore.Read(key)
		if !ok {
			return c.Status(fiber.StatusNotFound).JSON(fiber.Map{
				"error": "Key not found",
			})
		}
		return c.JSON(model.KeyValue{Key: key, Value: &val})
	})

	app.Post("/key", func(c *fiber.Ctx) error {
		var kv model.KeyValue
		if err := c.BodyParser(&kv); err != nil {
			return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{
				"error": "Invalid request body",
			})
		}

		b, err := json.Marshal(kv)
		if err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to marshal key-value pair",
			})
		}

		if raftNode.State() != raft.Leader {
			return c.Status(fiber.StatusServiceUnavailable).JSON(fiber.Map{
				"error": "Not the leader",
			})
		}

		if err := raftNode.Apply(b, 500*time.Millisecond).Error(); err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to set key-value pair",
			})
		}
		return c.JSON(kv)
	})

	app.Put("/key/:key", func(c *fiber.Ctx) error {
		key := c.Params("key")
		var kv model.KeyValue
		if err := c.BodyParser(&kv); err != nil {
			return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{
				"error": "Invalid request body",
			})
		}

		b, err := json.Marshal(kv)
		if err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to marshal key-value pair",
			})
		}

		if raftNode.State() != raft.Leader {
			return c.Status(fiber.StatusServiceUnavailable).JSON(fiber.Map{
				"error": "Not the leader",
			})
		}

		if err := raftNode.Apply(b, 500*time.Millisecond).Error(); err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to update key-value pair",
			})
		}
		return c.JSON(model.KeyValue{Key: key, Value: kv.Value})
	})

	app.Delete("/key/:key", func(c *fiber.Ctx) error {
		key := c.Params("key")
		kv := model.KeyValue{
			Key:   key,
			Value: nil,
		}

		b, err := json.Marshal(kv)
		if err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to marshal key-value pair",
			})
		}

		if raftNode.State() != raft.Leader {
			return c.Status(fiber.StatusServiceUnavailable).JSON(fiber.Map{
				"error": "Not the leader",
			})
		}

		if err := raftNode.Apply(b, 500*time.Millisecond).Error(); err != nil {
			return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
				"error": "Failed to delete key-value pair",
			})
		}
		return c.SendStatus(fiber.StatusNoContent)
	})

	if err := app.Listen(serverAddr); err != nil {
		panic(err)
	}
}
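The handlers and the FSM also rely on a small model package that isn’t shown. Given that Value is compared with nil and dereferenced in Apply, a plausible definition (a sketch, not necessarily the author’s exact code) is:

package model

// KeyValue is the unit of data replicated through the Raft log.
// A nil Value marks the key for deletion.
type KeyValue struct {
	Key   string  `json:"key"`
	Value *string `json:"value"`
}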

Now, the main function creates a Raft node and bootstraps the Raft cluster with the cluster configuration. One important thing: the cluster has a single leader that accepts write requests; the other nodes are read-only.

Now let’s see the changes in the Compose file:

version: '3.9'

services:
  app1:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      RAFT_PATH: /app/data
      RAFT_ADDR: app1:9000
      RAFT_ID: node1
      RAFT_NODE_ADDRS: node1@app1:9000,node2@app2:9000,node3@app3:9000
      SERVER_ADDR: :4000
    expose:
      - '4000'
      - '9000'
    volumes:
      - app1_data:/app/data

  app2:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      RAFT_PATH: /app/data
      RAFT_ADDR: app2:9000
      RAFT_ID: node2
      RAFT_NODE_ADDRS: node1@app1:9000,node2@app2:9000,node3@app3:9000
      SERVER_ADDR: :4000
    expose:
      - '4000'
      - '9000'
    volumes:
      - app2_data:/app/data

  app3:
    restart: always
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      RAFT_PATH: /app/data
      RAFT_ADDR: app3:9000
      RAFT_ID: node3
      RAFT_NODE_ADDRS: node1@app1:9000,node2@app2:9000,node3@app3:9000
      SERVER_ADDR: :4000
    expose:
      - '4000'
      - '9000'
    volumes:
      - app3_data:/app/data

  haproxy:
    image: haproxy
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - '8081:80'
    depends_on:
      - app1
      - app2
      - app3

volumes:
  app1_data:
  app2_data:
  app3_data:

This time, every application node comes with the following configuration:

  • RAFT_PATH — storage path for the logs and snapshots generated by Raft;
  • RAFT_ADDR — address used by the Raft protocol;
  • RAFT_ID — node identifier;
  • RAFT_NODE_ADDRS — the other nodes in the cluster;
  • SERVER_ADDR — HTTP server listen address.

Each node also needs a volume to store its state.

A node needs to know the other nodes in order to join and form the cluster, so they are provided via the RAFT_NODE_ADDRS variable. After the nodes join the cluster, an election begins and the cluster votes for a leader. Once the leader is elected, the cluster can accept incoming requests.

But this time, HAProxy should send write requests only to the leader. To do this, I’ve extended the application API with one more endpoint, GET /leader. It returns OK if the current node is the leader, or Service Unavailable if the node is a follower.

Now let’s update HAProxy configuration:

defaults
    mode http
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http-in
    bind *:80
    acl write_methods method POST DELETE PUT
    use_backend app-write if write_methods
    default_backend app-read-only

backend app-read-only
    balance roundrobin
    option httpchk GET /health
    server app1 app1:4000 check
    server app2 app2:4000 check
    server app3 app3:4000 check

backend app-write
    balance roundrobin
    option httpchk GET /leader
    server app1 app1:4000 check
    server app2 app2:4000 check
    server app3 app3:4000 check

This time there are two backends:

  • app-read-only — used for read requests. It uses the GET /health check, so ideally all nodes can accept this traffic;
  • app-write — used for write requests. It uses the GET /leader check, so only the leader node accepts this traffic (an optional tweak is sketched below).
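As an optional tweak (a sketch, not strictly required, because /leader already answers 503 on followers), you can tell HAProxy to accept only a 200 response for the leader check:

backend app-write
    balance roundrobin
    option httpchk GET /leader
    http-check expect status 200
    server app1 app1:4000 check
    server app2 app2:4000 check
    server app3 app3:4000 check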

Let’s start the cluster with the following command:

docker compose up -d

If you check the logs of every node (docker compose logs app1, and so on), you will probably see that one of the nodes becomes the Leader:

go-stateful-service-app1-1  | 2023-09-21T15:20:44.673Z [WARN]  raft: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
go-stateful-service-app1-1 | 2023-09-21T15:20:44.673Z [INFO] raft: entering candidate state: node="Node at 172.19.0.2:9000 [Candidate]" term=3
go-stateful-service-app1-1 | 2023-09-21T15:20:44.721Z [DEBUG] raft: voting for self: term=3 id=node1
go-stateful-service-app1-1 | 2023-09-21T15:20:44.780Z [DEBUG] raft: asking for vote: term=3 from=node2 address=app2:9000
go-stateful-service-app1-1 | 2023-09-21T15:20:44.780Z [DEBUG] raft: asking for vote: term=3 from=node3 address=app3:9000
go-stateful-service-app1-1 | 2023-09-21T15:20:44.780Z [DEBUG] raft: calculated votes needed: needed=2 term=3
go-stateful-service-app1-1 | 2023-09-21T15:20:44.780Z [INFO] raft: duplicate requestVote for same term: term=3
go-stateful-service-app1-1 | 2023-09-21T15:20:44.780Z [DEBUG] raft: vote granted: from=node1 term=3 tally=1
go-stateful-service-app1-1 | 2023-09-21T15:20:44.893Z [DEBUG] raft: vote granted: from=node3 term=3 tally=2
go-stateful-service-app1-1 | 2023-09-21T15:20:44.893Z [INFO] raft: election won: term=3 tally=2
go-stateful-service-app1-1 | 2023-09-21T15:20:44.893Z [INFO] raft: entering leader state: leader="Node at 172.19.0.2:9000 [Leader]"
go-stateful-service-app1-1 | 2023-09-21T15:20:44.893Z [INFO] raft: added peer, starting replication: peer=node2
go-stateful-service-app1-1 | 2023-09-21T15:20:44.893Z [INFO] raft: added peer, starting replication: peer=node3
go-stateful-service-app1-1 | 2023-09-21T15:20:44.894Z [INFO] raft: pipelining replication: peer="{Voter node3 app3:9000}"
go-stateful-service-app1-1 | 2023-09-21T15:20:44.915Z [INFO] raft: pipelining replication: peer="{Voter node2 app2:9000}"

The other nodes will be followers:

go-stateful-service-app2-1  | 2023-09-21T15:20:44.673Z [WARN]  raft: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
go-stateful-service-app2-1 | 2023-09-21T15:20:44.674Z [INFO] raft: entering candidate state: node="Node at 172.19.0.3:9000 [Candidate]" term=3
go-stateful-service-app2-1 | 2023-09-21T15:20:44.721Z [DEBUG] raft: asking for vote: term=3 from=node1 address=app1:9000
go-stateful-service-app2-1 | 2023-09-21T15:20:44.721Z [DEBUG] raft: voting for self: term=3 id=node2
go-stateful-service-app2-1 | 2023-09-21T15:20:44.809Z [DEBUG] raft: asking for vote: term=3 from=node3 address=app3:9000
go-stateful-service-app2-1 | 2023-09-21T15:20:44.810Z [DEBUG] raft: calculated votes needed: needed=2 term=3
go-stateful-service-app2-1 | 2023-09-21T15:20:44.810Z [INFO] raft: duplicate requestVote for same term: term=3
go-stateful-service-app2-1 | 2023-09-21T15:20:44.810Z [DEBUG] raft: vote granted: from=node2 term=3 tally=1
go-stateful-service-app2-1 | 2023-09-21T15:20:44.915Z [INFO] raft: entering follower state: follower="Node at 172.19.0.3:9000 [Follower]" leader-address=172.19.0.2:9000 leader-id=node1

In my case, app1 is the leader, so only this node can accept write requests. Let’s stop it with the following command:

docker compose stop app1

A few seconds later, a new node will be elected leader. In my case, it’s app2:

go-stateful-service-app2-1  | 2023-09-21T15:23:57.800Z [WARN]  raft: heartbeat timeout reached, starting election: last-leader-addr=172.19.0.2:9000 last-leader-id=node1 
go-stateful-service-app2-1 | 2023-09-21T15:23:57.800Z [INFO] raft: entering candidate state: node="Node at 172.19.0.3:9000 [Candidate]" term=4
go-stateful-service-app2-1 | 2023-09-21T15:23:57.817Z [DEBUG] raft: asking for vote: term=4 from=node1 address=app1:9000
go-stateful-service-app2-1 | 2023-09-21T15:23:57.817Z [DEBUG] raft: voting for self: term=4 id=node2
go-stateful-service-app2-1 | 2023-09-21T15:23:57.818Z [ERROR] raft: failed to make requestVote RPC: target="{Voter node1 app1:9000}" error=EOF term=4
go-stateful-service-app2-1 | 2023-09-21T15:23:57.849Z [DEBUG] raft: asking for vote: term=4 from=node3 address=app3:9000
go-stateful-service-app2-1 | 2023-09-21T15:23:57.849Z [DEBUG] raft: calculated votes needed: needed=2 term=4
go-stateful-service-app2-1 | 2023-09-21T15:23:57.849Z [DEBUG] raft: vote granted: from=node2 term=4 tally=1
go-stateful-service-app2-1 | 2023-09-21T15:23:58.066Z [INFO] raft: duplicate requestVote for same term: term=4
go-stateful-service-app2-1 | 2023-09-21T15:24:00.498Z [WARN] raft: Election timeout reached, restarting election
go-stateful-service-app2-1 | 2023-09-21T15:24:00.498Z [INFO] raft: entering candidate state: node="Node at 172.19.0.3:9000 [Candidate]" term=5
go-stateful-service-app2-1 | 2023-09-21T15:24:00.515Z [DEBUG] raft: asking for vote: term=5 from=node1 address=app1:9000
go-stateful-service-app2-1 | 2023-09-21T15:24:00.515Z [DEBUG] raft: voting for self: term=5 id=node2
go-stateful-service-app2-1 | 2023-09-21T15:24:00.551Z [DEBUG] raft: asking for vote: term=5 from=node3 address=app3:9000
go-stateful-service-app2-1 | 2023-09-21T15:24:00.552Z [DEBUG] raft: calculated votes needed: needed=2 term=5
go-stateful-service-app2-1 | 2023-09-21T15:24:00.552Z [DEBUG] raft: vote granted: from=node2 term=5 tally=1
go-stateful-service-app2-1 | 2023-09-21T15:24:00.655Z [DEBUG] raft: vote granted: from=node3 term=5 tally=2
go-stateful-service-app2-1 | 2023-09-21T15:24:00.655Z [INFO] raft: election won: term=5 tally=2
go-stateful-service-app2-1 | 2023-09-21T15:24:00.655Z [INFO] raft: entering leader state: leader="Node at 172.19.0.3:9000 [Leader]"
go-stateful-service-app2-1 | 2023-09-21T15:24:00.655Z [INFO] raft: added peer, starting replication: peer=node1
go-stateful-service-app2-1 | 2023-09-21T15:24:00.655Z [INFO] raft: added peer, starting replication: peer=node3
go-stateful-service-app2-1 | 2023-09-21T15:24:00.655Z [INFO] raft: pipelining replication: peer="{Voter node3 app3:9000}"

Now, if you start app1 again, it will rejoin the cluster as a follower:

go-stateful-service-app1-1  | 2023-09-21T15:25:37.852Z [INFO]  raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:node1 Address:app1:9000} {Suffrage:Voter ID:node2 Address:app2:9000} {Suffrage:Voter ID:node3 Address:app3:9000}]"
go-stateful-service-app1-1 | 2023-09-21T15:25:37.853Z [INFO] raft: entering follower state: follower="Node at 172.19.0.2:9000 [Follower]" leader-address= leader-id=
go-stateful-service-app1-1 | 2023/09/21 15:25:37 bootstrap only works on new clusters
go-stateful-service-app1-1 | 2023-09-21T15:25:38.471Z [WARN] raft: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
go-stateful-service-app1-1 | 2023-09-21T15:25:38.471Z [INFO] raft: entering candidate state: node="Node at 172.19.0.2:9000 [Candidate]" term=4
go-stateful-service-app1-1 | 2023-09-21T15:25:38.479Z [DEBUG] raft: voting for self: term=4 id=node1
go-stateful-service-app1-1 | 2023-09-21T15:25:38.481Z [DEBUG] raft: asking for vote: term=4 from=node2 address=app2:9000
go-stateful-service-app1-1 | 2023-09-21T15:25:38.481Z [DEBUG] raft: asking for vote: term=4 from=node3 address=app3:9000
go-stateful-service-app1-1 | 2023-09-21T15:25:38.481Z [DEBUG] raft: calculated votes needed: needed=2 term=4
go-stateful-service-app1-1 | 2023-09-21T15:25:38.481Z [DEBUG] raft: vote granted: from=node1 term=4 tally=1
go-stateful-service-app1-1 | 2023-09-21T15:25:38.483Z [DEBUG] raft: newer term discovered, fallback to follower: term=5
go-stateful-service-app1-1 | 2023-09-21T15:25:38.486Z [INFO] raft: entering follower state: follower="Node at 172.19.0.2:9000 [Follower]" leader-address=172.19.0.3:9000 leader-id=node2

Each time the leader changes, HAProxy redirects write traffic to the new leader, because only the leader passes the GET /leader health check.
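For example, the same write request through HAProxy keeps working after the failover, because the app-write backend now points at the new leader:

curl --location 'localhost:8081/key' \
--header 'Content-Type: application/json' \
--data '{
"key": "hello",
"value": "again"
}'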

All HTTP API methods work the same as for stateless service from the user’s perspective. Under the hood, the Raft algorithm replicates data across the cluster.

As you can see, it’s still possible to scale a stateful service horizontally, but this time you need to think through a lot of details carefully.

Application Code

You can find it here. Don’t forget to star the repository!

Conclusion

The ability to scale web applications is one of the most important non-functional requirements nowadays. It affects the availability and performance of the application.

If an application can only be scaled vertically, that’s not enough to call it “scalable”. Web applications should be able to scale horizontally as well.

As you can see, there are different approaches to horizontal scaling, and there are a lot of details you need to consider. A stateless application can always be scaled out, so do not store any state inside the application if you can avoid it.

For a modern HTTP API, it’s preferable to use an external database to store the state instead of keeping it inside the application. But if you are developing a database yourself, you can’t rely on an external one, and you’ll need to think about many of these distributed-systems details on your own.
