You should prefer Event-based over Thread-based most of the time

Why this choice matters even more in a cloud world, where you pay for "CPU time"

Alessandro Ferlin
THRON tech blog
5 min read · Feb 26, 2019


Counterintuitively, a large part of an application's time is often spent waiting for something. This happens all the time when interacting with a database, reading a file, or accessing the network. Every time we wait for something we are wasting CPU cycles, and this should always be considered a cost inefficiency, especially in a cloud world (where the industry is moving towards a pay-per-CPU-time model).

Why a thread-based server wastes CPU cycles

In a thread-based server each request is resolved synchronously: each request is processed by a single thread, and when the response is complete the thread returns to the thread pool. If we need to call an external service to resolve the request, the thread has to wait for the response:

def thePostService(req: Request): Response = {
  // we are inside the thread that resolves the request
  // the call below blocks while waiting for the external service:
  // if that service takes 3 seconds to respond, we waste
  // 3 seconds that we could have used to do something else
  val externalResponse = externalService.post(req.data)
  // do stuff with the external response
  Response(externalResponse.body)
}

An event-based server doesn't waste CPU cycles

In an event-based server each request is resolved asynchronously: the thread returns to the thread pool before the response is complete and is immediately ready to serve another request. A single thread (or a small number of threads) routes and manages all requests. So if we need to call an external service to resolve the request, we can't wait for the response, otherwise we would quickly exhaust all our threads; for this reason we need a callback mechanism, usually a handler:

// the response object has already been provided by the caller
def thePostServiceHandler(request: Request, response: Response): Unit = {
  // we are inside the function that resolves the request
  // no waiting here! When the external response is ready,
  // the callback below is invoked and we reply with it
  externalService.post(request.data, onResponse = { externalResponse =>
    // do stuff with the external response
    response.send(externalResponse.body)
  })
}

As you can see, the code is more complex than the thread-based snippet, but we can use the Future construct (Scala's Future, JavaScript's Promise, or any similar abstraction) to simplify it:

def thePostService(req: Request): Future[Response] = {
  val externalResponseFuture = externalService.post(req.data)
  // we return a Future[Response] which is completed
  // when the external response arrives
  externalResponseFuture.map { externalResponse =>
    // do stuff with the external response
    Response(externalResponse.body)
  }
}
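The Request/Response types above are hypothetical. A minimal runnable sketch of the same shape, using the standard scala.concurrent API and simulating the external service with a Future, might look like:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// hypothetical external service, simulated asynchronously
def externalPost(data: String): Future[String] =
  Future { s"echo:$data" }

def thePostService(data: String): Future[String] =
  // map registers a callback: no thread blocks while
  // waiting for the external response
  externalPost(data).map(externalBody => s"response($externalBody)")

// Await is used here only to print the result from a script;
// a real event-based server would never block like this
println(Await.result(thePostService("hello"), 1.second))
// prints response(echo:hello)
```

Note that `map` composes asynchronous steps without ever parking a thread, which is exactly what makes the Future version read almost like the synchronous one.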

Some numbers (measuring RPS)

Imagine a single-core server. Suppose a request takes t_cpu milliseconds of CPU time and t_wait milliseconds waiting for external resources; the total response time is t_cpu + t_wait milliseconds.

If the server is thread-based (synchronous) and can handle threadNum threads before performance degrades, the formula for requests per second is:

min(threadNum * (1000 / (t_wait + t_cpu)), 1000 / t_cpu)

We take the smaller of the thread-bound and CPU-bound limits because even if we increase the number of threads we are still running on a physically limited CPU resource (a thread is not a CPU core).

If the server is event-based (and asynchronous) we can skip the waiting time and consider only the CPU time:

(1000 / t_cpu)
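The two formulas can be sketched as a small Scala helper (the function names are illustrative, not from any library):

```scala
// requests per second on a single-core server;
// tCpu and tWait are in milliseconds
def threadBasedRps(threadNum: Int, tCpu: Double, tWait: Double): Double =
  // thread-bound limit vs. CPU-bound limit: take the smaller
  math.min(threadNum * (1000.0 / (tWait + tCpu)), 1000.0 / tCpu)

def eventBasedRps(tCpu: Double): Double =
  // no waiting: only CPU time limits throughput
  1000.0 / tCpu

// example: 25 threads, 0.2 ms of CPU time, 24.8 ms of waiting
println(threadBasedRps(25, 0.2, 24.8)) // prints 1000.0
println(eventBasedRps(0.2))            // prints 5000.0
```

Plugging in the numbers from the scenarios below reproduces every RPS figure quoted in this section.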

Let's try some use cases, assuming we can run 25 threads without overhead (threadNum = 25):

Proxy Server

In a proxy server CPU usage is very low, so suppose 0.2 ms of CPU time; waiting makes up the bulk of the request, say 24.8 ms. Applying the formulas above, we get the following RPS (requests per second) figures:

thread-based -> 1000.0
event-based -> 5000.0

The event-based server performs 5 times better than the thread-based server: the event-based server doesn't waste CPU time waiting, while the thread bound limits the thread-based server's performance.

Application Server (no external calls)

Another example might be a light application with no external resource calls: one that just extracts claims from a JWT, or performs some computation or cryptographic work. Suppose a CPU time of 2 ms and a waiting time of 0 ms:

thread-based -> 500.0
event-based -> 500.0

In this case the performance is the same because the workload is CPU bound.

Application Server (with external call)

A classic application, where we access a database or the file system: suppose a CPU time of 2 ms and a waiting time of 98 ms for the external resource.

thread-based -> 250.0
event-based -> 500.0

This is a very common use case; the event-based server can handle twice as many users per second.

Application Server (high latency external call)

Let's try a microservice model, where the application usually calls one or more external resources (other applications): suppose a CPU time of 2 ms and a waiting time of 248 ms for the external resource.

thread-based -> 100.0
event-based -> 500.0

As the waiting time increases, the performance gap between thread-based and event-based grows.

Application Server (long polling client)

Last but not least, an application with a long-polling client (e.g. SQS message retrieval): the server replies only when it has a message that matches the request, and the connection stays open during the waiting time. Suppose 1 ms of CPU time and 9999 ms of waiting time.

thread-based -> 2.5
event-based -> 1000.0

Guess there’s nothing to say about that.

Conclusion

In this article we have made some strong simplifications and used quite basic examples, ignoring some of the overheads of both event-based and thread-based servers, but those overheads do not meaningfully change the end result.

As you can see, everything depends on the balance between CPU time and waiting time: an event-based server performs better when the gap between waiting time and CPU time is significant, which is the case for most web applications. Otherwise, thread-based and event-based servers perform almost the same.

IMHO, event-based is the preferable choice when the language has a rich asynchronous ecosystem; in Scala (our main language of choice) using event-based servers is simple, and we see a common trend of new libraries moving in the asynchronous (reactive) direction.



Backend Competence Leader and Backend developer @ THRON. Functional Programming enthusiast. Music and drone lover.