Overcoming IO overhead in micro-services

Kislay Verma
Published in The Startup
6 min read · Oct 8, 2019

One of the biggest overheads of adopting a micro-service architecture is the cost of inter-service communication. The overhead comes in many forms: the latency added by network calls, failures in deep call stacks, error handling across distributed state, and so on. But to my mind, one of the most insidious costs is the one each service pays in resources wasted while waiting for network IO to complete.

[Figure: Thread 1 just “sits around”, waiting for IO to complete]

You know how the story goes: Service A makes a call to Service B, and the thread that made the call waits until the response from Service B arrives, after which sequential execution of the code resumes. Known as the one-thread-per-request model, this is the prevalent programming model in most programming languages and frameworks, with very few exceptions.
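
To make the blocked thread concrete, here is a minimal Python sketch of the one-thread-per-request model. Everything in it is an assumption made for illustration: `call_service_b` is a hypothetical client, `time.sleep` stands in for the network round trip, and the latency and pool size are made-up numbers rather than measurements.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed round-trip time to Service B; an illustrative value, not a measurement.
SERVICE_B_LATENCY_S = 0.2


def call_service_b(request_id: int) -> str:
    """Hypothetical blocking client call; time.sleep stands in for network IO."""
    time.sleep(SERVICE_B_LATENCY_S)  # the calling thread is parked here, doing no work
    return f"response-{request_id}"


def handle_request(request_id: int) -> tuple[str, str]:
    """Service A's handler: runs from start to finish on one worker thread."""
    response = call_service_b(request_id)  # blocks until Service B answers
    return response, threading.current_thread().name


def serve(num_requests: int, num_threads: int) -> tuple[list[tuple[str, str]], float]:
    """Handle num_requests on a fixed-size pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads, thread_name_prefix="worker") as pool:
        results = list(pool.map(handle_request, range(num_requests)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = serve(num_requests=4, num_threads=2)
    for response, worker in results:
        print(f"{response} handled by {worker}")
    print(f"4 requests on 2 threads took {elapsed:.2f}s")
```

With two worker threads and four requests, the run should take at least two round trips, even though the CPU is idle almost the whole time. Each thread spends its life waiting on Service B, so throughput is capped by the number of threads rather than by the work Service A actually does.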

What’s wrong with a blocked thread?

In computational terms, a thread is a very expensive resource. Some people find this statement strange, since the textbook definition of a thread is a “lightweight process”. How can a thread be expensive? The answer lies in hardware and programming models. A thread is a unit of computation, and only one thread can run on a CPU core at any given point in time. This means that though we can extract a lot of juice from our CPU cores and OS using smart scheduling algorithms, having a lot of…
