First Step to Tune gRPC in Java

Mahdi Yusefi
4 min read · Jun 23, 2022

Recently I developed two applications with the power of gRPC for my employer. Both applications had originally been written with a REST framework, namely Ktor, and I faced challenges tuning the application server, so I want to share a little bit about it.

As you surely know, gRPC is a Remote Procedure Call (RPC) framework that uses HTTP/2 for transport and Protocol Buffers as its interface description language, which many consider a more suitable choice than REST for a web framework in a microservices infrastructure.

gRPC provides many features, and with them many open tuning topics, but in this story I stay focused on the simplest scenario, one you have already had to deal with when choosing a REST framework for an application: many requests rush into the server, so how should I optimize my gRPC server so that it can respond to all of them?
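The code snippet the original post showed at this point (apparently embedded as an image) did not survive. Based on the description in the next paragraph, a minimal sketch might look like the following; the class name GrpcApi and port 5001 come from the article, and the service is passed in as a parameter only to keep the sketch self-contained:

```java
import io.grpc.BindableService;
import io.grpc.Server;
import io.grpc.ServerBuilder;

public class BasicGrpcServer {

    // In the article, `new GrpcApi()` would be passed in here.
    public static Server start(BindableService service) throws java.io.IOException {
        return ServerBuilder
                .forPort(5001)        // listen on port 5001
                .addService(service)  // bind all RPC methods implemented in the service
                .build()
                .start();
    }
}
```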

The above simple lines of code bind all the methods implemented inside GrpcApi.java, run a server on port 5001, and respond to all incoming requests immediately. Regardless of the logic implemented inside the subroutines or the datasource used to provide data, this configuration could make your microservice non-functional, and you may wonder, “Wasn’t it supposed to give better performance than REST?”

Take a look at the start() method again: we simply gave it a port and an API class and got a server. ServerBuilder provides a very simple Server with default configuration. By default, this server uses a cached thread pool, and whenever a request arrives it asks the pool for a thread and executes the subroutine code. So at any moment we have one running thread per in-flight request. Soon you’ll see many requests rushing into the datasource, e.g. Redis, and the datasource may not be able to tolerate and respond to such a high load; at that point the application will encounter latency and may even crash or fail. But don’t worry, it’s not the end of the world.
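The adjustment described next, replacing the default cached pool with a bounded one, would look roughly like this. The pool size of 350 comes from the article; GrpcApi is again stood in for by a service parameter:

```java
import io.grpc.BindableService;
import io.grpc.Server;
import io.grpc.ServerBuilder;
import java.util.concurrent.Executors;

public class FixedPoolGrpcServer {

    public static Server start(BindableService service) throws java.io.IOException {
        return ServerBuilder
                .forPort(5001)
                .addService(service)
                // Bound the server to 350 application threads instead of
                // the default unbounded cached thread pool.
                .executor(Executors.newFixedThreadPool(350))
                .build()
                .start();
    }
}
```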

Yes, ServerBuilder accepts a java.util.concurrent.Executor as a thread pool. Now you have many options to tune your gRPC server based on your needs and application load, but for this example I just used a fixed thread pool. This means the gRPC server has only 350 threads to execute tasks. This limitation can still lead to the previous problem of too many tasks collecting at the same time, and it may even cause request failures, because at some moments there may be no free thread to accept a request. One might say we can increase the pool size, or provide a blocking queue for tasks that have no free thread, as various java.util.concurrent.Executor implementations allow. All of these solutions can work, but I have a better one, and that is Netty!

Netty is a non-blocking I/O client-server framework for developing Java network applications such as protocol servers and clients. I have no intention here of answering “What is non-blocking I/O?” or “How can it give us better performance?”; if you’re interested, check out this link! But how can I use Netty to enhance gRPC server performance? Let’s come back to our famous method.
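The Netty variant the next paragraphs describe might be sketched as follows. The choices of one boss thread, five worker threads, and a 350-thread executor come from the article; GrpcApi is again assumed and passed in as a parameter:

```java
import io.grpc.BindableService;
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import java.util.concurrent.Executors;

public class NettyGrpcServer {

    public static Server start(BindableService service) throws java.io.IOException {
        return NettyServerBuilder
                .forPort(5001)
                .addService(service)
                .channelType(NioServerSocketChannel.class)      // NIO transport
                .bossEventLoopGroup(new NioEventLoopGroup(1))   // 1 thread accepts connections
                .workerEventLoopGroup(new NioEventLoopGroup(5)) // 5 threads do socket reads/writes
                .executor(Executors.newFixedThreadPool(350))    // application code runs here
                .build()
                .start();
    }
}
```

Note that once you set a custom channel type, it is sensible to supply both event loop groups explicitly as well, as done above.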

The above code creates a Netty gRPC server based on an NIO channel type. When you choose Netty as your gRPC server, you also need to specify a channel type. A channel is a bidirectional flow where I/O operations (read and write) are performed in an asynchronous fashion. The Netty implementation for gRPC, provided by “io.grpc:grpc-netty:$grpcVersion”, accepts two kinds of channels, EpollServerSocketChannel and NioServerSocketChannel; the first one is the default, but I chose NioServerSocketChannel. In short, the major difference is that Epoll is a Linux-specific implementation of Java’s non-blocking (NIO) abstraction; you can find more info on the web.

As you’ve seen, we have two other options for the Netty server: BossGroup and WorkerGroup. The former is responsible for accepting client connections, and the latter for network reads and writes; they actually refer to threads processing tasks in a continuous loop. I chose only one thread for accepting connections, which is enough because in most cases accepting connections is not time-consuming, and five threads for read/write actions; keep in mind that if you transfer large data you may need more. Finally, there is the java.util.concurrent.Executor for executing GrpcApi methods, for which all the considerations mentioned earlier must be taken into account.

As a result, Netty helped me break down my application’s overhead points. Simply put, I dedicated one thread to accepting connections and a few threads to I/O actions, instead of a single thread pool executing both my code and request processing. If my code requires more threads, I can increase the executor size without worrying about request rushes, and I no longer worry about requests failing because all threads are busy. Still, you need to be cautious when choosing the parameters, e.g. the number of WorkerGroup threads; I found mine through trial and error, but there are many powerful tools to calculate them.
