Mastering Concurrent Processing: A Step-by-Step Guide to Building a Scalable Worker Pool in Go

Sourav Choudhary
7 min read · May 26, 2024

(Process 10k requests per second)

Connect with me on LinkedIn 🤝 to talk about crafting scalable systems

In this blog post, we’ll dive into building a scalable worker pool in Go. This implementation efficiently manages a pool of workers to handle a large number of requests while dynamically scaling the number of workers based on the load. We’ll also discuss potential pitfalls and how to avoid them.

Overview

We’ll create a worker pool with the following capabilities:

  • Dynamically scale the number of workers based on load.
  • Handle incoming requests with a timeout and retry mechanism.
  • Gracefully shut down workers.

Below is an explanation of each component, followed by the complete code.

[Diagram: Go worker pool]

Dispatcher

The dispatcher is responsible for managing the workers and distributing the incoming requests among them. It can dynamically add or remove workers based on the current load and ensures a graceful shutdown of all workers.

  • AddWorker: Adds a new worker to the pool and increments the worker count. The worker is launched to start processing requests.
  • RemoveWorker: Removes a worker from the pool if there are more than the minimum required workers. The worker is signaled to stop via the stopCh channel.
  • ScaleWorkers: Dynamically adjusts the number of workers based on the load. If the load exceeds a threshold and the pool is below the maximum size, a new worker is added. If the load falls below 75% of the threshold and the pool is above the minimum size, a worker is removed. A standalone sketch of this rule appears right after this list.
  • LaunchWorker: Launches a worker and increments the worker count. This is typically used for the initial set of workers.
  • MakeRequest: Adds a request to the input channel. If the channel is full, the request is dropped, and a message is logged.
  • Stop: Gracefully stops all workers. It waits for all workers to finish processing their current requests. If the timeout is reached, it forcefully stops all workers.
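
To make the decision concrete, here is a minimal sketch of the rule ScaleWorkers applies on every tick. scaleDecision is a hypothetical helper written only for illustration; the real code further below inlines this logic:

// scaleDecision mirrors the scaling rule in ScaleWorkers: grow above the
// threshold, shrink below 75% of it, otherwise leave the pool alone.
func scaleDecision(load, workerCount, minWorkers, maxWorkers, loadThreshold int) int {
	switch {
	case load > loadThreshold && workerCount < maxWorkers:
		return +1 // Queue is backing up and there is room to grow: add a worker
	case load < loadThreshold*3/4 && workerCount > minWorkers:
		return -1 // Queue has drained well below the threshold: remove a worker
	default:
		return 0 // Stay where we are
	}
}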

Worker

The Worker struct represents a worker that processes requests. Each worker runs in its own goroutine and listens for incoming requests on a channel.

  • LaunchWorker: Launches the worker in a separate goroutine. The worker processes incoming requests until the input channel is closed or it receives a stop signal.
  • processRequest: Processes a single request. It retries the request up to the specified maximum retries if an error occurs or if the request times out.
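
The essence of this pattern is to run the handler in its own goroutine and race it against a deadline. Here is an illustrative sketch; runWithTimeout is a hypothetical helper, not part of the pool's API:

// runWithTimeout races a single handler invocation against a timeout.
func runWithTimeout(h RequestHandler, data interface{}, timeout time.Duration) error {
	done := make(chan error, 1) // Buffered so the goroutine never blocks on send
	go func() { done <- h(data) }()
	select {
	case err := <-done:
		return err // Handler finished (err may be nil)
	case <-time.After(timeout):
		return fmt.Errorf("handler timed out after %v", timeout)
	}
}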

The Code

// struct.go

package workerpool

import "time"

// Request represents a request to be processed by a worker.
type Request struct {
	Handler    RequestHandler // Optional per-request handler (workers dispatch via their ReqHandler map, keyed by Type)
	Type       int
	Data       interface{}
	Timeout    time.Duration // Per-attempt timeout for the request
	Retries    int           // Number of retries (currently unused)
	MaxRetries int           // Maximum number of retries, i.e. MaxRetries+1 attempts in total
}

// RequestHandler defines a function type for handling requests.
type RequestHandler func(interface{}) error
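
As an illustration (the values are arbitrary), a request with a 50 ms per-attempt timeout and up to two retries, three attempts in total, could be built like this:

req := Request{
	Type:       1,                     // Dispatched via ReqHandler[1]
	Data:       "payload",
	Timeout:    50 * time.Millisecond, // Deadline for each attempt
	MaxRetries: 2,                     // Retry twice after the first failure
}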
// interface.go

package workerpool

import "context"

// WorkerLauncher is an interface for launching workers.
type WorkerLauncher interface {
	LaunchWorker(in chan Request, stopCh chan struct{})
}

// Dispatcher is an interface for managing the worker pool.
type Dispatcher interface {
	AddWorker(w WorkerLauncher)
	RemoveWorker(minWorkers int)
	LaunchWorker(id int, w WorkerLauncher)
	ScaleWorkers(minWorkers, maxWorkers, loadThreshold int)
	MakeRequest(Request)
	Stop(ctx context.Context)
}

// worker.go

package workerpool

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Worker represents a worker that processes requests.
type Worker struct {
	Id         int
	Wg         *sync.WaitGroup
	ReqHandler map[int]RequestHandler
}

// LaunchWorker launches the worker to process incoming requests.
// It runs in a separate goroutine, continuously listening for incoming requests on the input channel.
// The worker stops when the input channel is closed or when it receives a stop signal.
func (w *Worker) LaunchWorker(in chan Request, stopCh chan struct{}) {
	go func() {
		defer w.Wg.Done()
		for {
			select {
			case msg, open := <-in:
				if !open {
					// Without this check, a worker would keep reading
					// zero values from the closed channel forever.
					fmt.Println("Stopping worker:", w.Id)
					return
				}
				w.processRequest(msg)
				time.Sleep(1 * time.Microsecond) // Brief yield between requests (the select already blocks when the queue is empty)
			case <-stopCh:
				fmt.Println("Stopping worker:", w.Id)
				return
			}
		}
	}()
}

// processRequest processes a single request, retrying up to MaxRetries
// times if the handler returns an error or the per-attempt timeout elapses.
func (w *Worker) processRequest(msg Request) {
	fmt.Printf("Worker %d processing request: %v\n", w.Id, msg)
	handler, ok := w.ReqHandler[msg.Type]
	if !ok {
		fmt.Println("Handler not implemented: workerID:", w.Id)
		return
	}
	if msg.Timeout == 0 {
		msg.Timeout = 10 * time.Millisecond // Default timeout
	}
	for attempt := 0; attempt <= msg.MaxRetries; attempt++ {
		done := make(chan error, 1) // Buffered so the handler goroutine never blocks on send
		// cancel is called explicitly in each branch below rather than
		// deferred, so contexts do not pile up across retry iterations.
		ctx, cancel := context.WithTimeout(context.Background(), msg.Timeout)

		go func() {
			done <- handler(msg.Data)
		}()

		select {
		case err := <-done:
			cancel()
			if err == nil {
				return // Successfully processed
			}
			fmt.Printf("Worker %d: Error processing request: %v\n", w.Id, err)
		case <-ctx.Done():
			// The handler goroutine keeps running after a timeout; it is
			// abandoned, not cancelled.
			cancel()
			fmt.Printf("Worker %d: Timeout processing request: %v\n", w.Id, msg.Data)
		}
		if attempt < msg.MaxRetries {
			fmt.Printf("Worker %d: Retry %d for request %v\n", w.Id, attempt+1, msg.Data)
		}
	}
	fmt.Printf("Worker %d: Failed to process request %v after %d retries\n", w.Id, msg.Data, msg.MaxRetries)
}
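
One caveat in processRequest: when an attempt times out, the handler goroutine is abandoned rather than cancelled, so a slow handler keeps running in the background. A context-aware handler signature would avoid that leak. The sketch below is a possible variant, not the pool's actual RequestHandler type:

// CtxRequestHandler is a hypothetical context-aware handler type.
type CtxRequestHandler func(ctx context.Context, data interface{}) error

// slowHandler simulates work but stops promptly when the attempt's
// context is cancelled or its deadline expires.
func slowHandler(ctx context.Context, data interface{}) error {
	select {
	case <-time.After(100 * time.Millisecond): // Simulated work
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}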
// dispatcher.go

package workerpool

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// ReqHandler is a map of request handlers, keyed by request type.
var ReqHandler = map[int]RequestHandler{
	1: func(data interface{}) error {
		return nil // Type 1: a no-op handler used by the demo
	},
}

// dispatcher manages a pool of workers and distributes incoming requests among them.
type dispatcher struct {
	inCh        chan Request
	wg          *sync.WaitGroup
	mu          sync.Mutex
	workerCount int
	stopCh      chan struct{} // Channel to signal workers to stop
}

// AddWorker adds a new worker to the pool and increments the worker count.
func (d *dispatcher) AddWorker(w WorkerLauncher) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.workerCount++
	d.wg.Add(1)
	w.LaunchWorker(d.inCh, d.stopCh)
}

// RemoveWorker removes a worker from the pool if the worker count is greater than minWorkers.
func (d *dispatcher) RemoveWorker(minWorkers int) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.workerCount > minWorkers {
		d.workerCount--
		d.stopCh <- struct{}{} // Signal a worker to stop
	}
}

// ScaleWorkers dynamically adjusts the number of workers based on the load.
// It runs for the life of the program, checking the queue length on every tick.
func (d *dispatcher) ScaleWorkers(minWorkers, maxWorkers, loadThreshold int) {
	ticker := time.NewTicker(time.Microsecond)
	defer ticker.Stop()

	for range ticker.C {
		load := len(d.inCh) // Current load is the number of pending requests in the channel

		// Snapshot the worker count under the lock to avoid a data race
		// with AddWorker/RemoveWorker.
		d.mu.Lock()
		count := d.workerCount
		d.mu.Unlock()

		if load > loadThreshold && count < maxWorkers {
			fmt.Println("Scaling Triggered")
			newWorker := &Worker{
				Wg:         d.wg,
				Id:         count,
				ReqHandler: ReqHandler,
			}
			d.AddWorker(newWorker)
		} else if load < loadThreshold*3/4 && count > minWorkers { // Below 75% of the threshold
			fmt.Println("Reducing Triggered")
			d.RemoveWorker(minWorkers)
		}
	}
}

// LaunchWorker launches a worker and increments the worker count.
func (d *dispatcher) LaunchWorker(id int, w WorkerLauncher) {
	d.mu.Lock()
	d.workerCount++
	d.wg.Add(1) // The worker calls Wg.Done() when it exits
	d.mu.Unlock()
	w.LaunchWorker(d.inCh, d.stopCh) // Pass stopCh to the worker
}

// MakeRequest adds a request to the input channel, or drops it if the channel is full.
func (d *dispatcher) MakeRequest(r Request) {
	select {
	case d.inCh <- r:
	default:
		// The channel is full; drop the request.
		fmt.Println("Request channel is full. Dropping request.")
		// Alternatively, you could block, buffer the request elsewhere, or return an error.
	}
}

// Stop gracefully stops all workers, waiting for them to finish processing.
func (d *dispatcher) Stop(ctx context.Context) {
	fmt.Println("\nstop called")
	close(d.inCh) // Close the input channel to signal that no more requests will be sent
	done := make(chan struct{})

	go func() {
		d.wg.Wait() // Wait for all workers to finish
		close(done)
	}()

	select {
	case <-done:
		fmt.Println("All workers stopped gracefully")
	case <-ctx.Done():
		fmt.Println("Timeout reached, forcing shutdown")
		// Forcefully signal all workers to stop. stopCh is buffered with
		// capacity maxWorkers, so these sends do not block even if some
		// workers have already exited.
		d.mu.Lock()
		count := d.workerCount
		d.mu.Unlock()
		for i := 0; i < count; i++ {
			d.stopCh <- struct{}{}
		}
	}

	d.wg.Wait()
}

// NewDispatcher creates a new dispatcher with a buffered channel and a wait group.
func NewDispatcher(b int, wg *sync.WaitGroup, maxWorkers int) Dispatcher {
	return &dispatcher{
		inCh:   make(chan Request, b),
		wg:     wg,
		stopCh: make(chan struct{}, maxWorkers), // Buffered so stop signals never block
	}
}
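
Because ReqHandler is an exported package-level map, callers can register handlers for additional request types before the pool starts (registering early also avoids a concurrent map write once workers are running). In this illustration, request type 2 and its string payload are assumptions, not something defined in this post; wp is the import alias used in main.go below:

// Register a handler for a hypothetical request type 2.
wp.ReqHandler[2] = func(data interface{}) error {
	s, ok := data.(string)
	if !ok {
		return fmt.Errorf("expected string payload, got %T", data)
	}
	fmt.Println("processing:", s)
	return nil
}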
// main.go

package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
	"time"

	wp "workerpool/workerpool"
)

func main() {
	// Set GOMAXPROCS to the number of available CPUs
	numCPU := runtime.NumCPU()
	runtime.GOMAXPROCS(numCPU)
	fmt.Printf("Running with %d CPUs\n", numCPU)

	// Configuration
	bufferSize := 50000
	maxWorkers := 20
	minWorkers := 3
	loadThreshold := 40000
	requests := 50000

	var wg sync.WaitGroup
	dispatcher := wp.NewDispatcher(bufferSize, &wg, maxWorkers)

	// Start the initial set of workers
	for i := 0; i < minWorkers; i++ {
		fmt.Printf("Starting worker with id %d\n", i)
		w := &wp.Worker{
			Wg:         &wg,
			Id:         i,
			ReqHandler: wp.ReqHandler,
		}
		dispatcher.AddWorker(w)
	}

	// Start the scaling logic in a separate goroutine
	go dispatcher.ScaleWorkers(minWorkers, maxWorkers, loadThreshold)

	// Send requests to the dispatcher
	for i := 0; i < requests; i++ {
		req := wp.Request{
			Data:    fmt.Sprintf("(Msg_id: %d) -> Hello", i),
			Handler: func(result interface{}) error { return nil }, // Unused: workers dispatch via ReqHandler, keyed by Type
			Type:    1,
			Timeout: 5 * time.Second,
		}
		dispatcher.MakeRequest(req)
	}

	// Gracefully stop the dispatcher
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	dispatcher.Stop(ctx)
	fmt.Println("Exiting main!")
}

Main

The main function initializes the dispatcher and workers, sends requests to the dispatcher, and stops the dispatcher gracefully.

  • Sets GOMAXPROCS to the number of available CPUs.
  • Initializes the dispatcher and starts the initial set of workers.
  • Sends requests to the dispatcher.
  • Gracefully stops the dispatcher with a timeout.

We’ll adjust parameters such as context timeout, buffer size, and minimum/maximum workers to maximize requests per second (RPS) and improve application performance.
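
As a rough way to measure RPS, you can time the request loop together with the graceful shutdown. This is a sketch built on the main function above; keep in mind that MakeRequest drops requests when the channel is full, so requests is an upper bound on what was actually processed:

start := time.Now()
for i := 0; i < requests; i++ {
	dispatcher.MakeRequest(req) // Build req exactly as in main above
}
dispatcher.Stop(ctx) // Wait for the queue to drain
elapsed := time.Since(start)
fmt.Printf("Handled %d requests in %v (~%.0f req/s)\n",
	requests, elapsed, float64(requests)/elapsed.Seconds())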

Stay tuned for practical insights and real-world examples!

If you've read this far, I hope you liked this article. If you did, please clap, as it motivates me to keep helping the community.

Please comment if you find any discrepancies in this article or if you have questions about it.

Thank you for your time.

Connect with me on LinkedIn 🤝
