Spring Boot 2.0 was released recently, and everyone is excited about the new features and improvements.
I love the ease and simplicity with which Spring Boot lets you create services with minimum fuss.
One of the most interesting and noteworthy additions is support for the reactive programming model.
Spring 5 introduced the WebFlux framework, a fully asynchronous, non-blocking reactive web stack that can handle a massive number of concurrent connections. This lets a service scale to handle more load on the same hardware.
The intent of this experiment is to compare the traditional servlet stack with the reactive stack by capturing their performance under load.
To achieve this, I have created 3 services as follows:
* A PersonService based on spring-boot web 1.5.10.RELEASE
* A second PersonService based on spring-boot web 2.0.0.RELEASE
* A third PersonService based on spring-boot reactive-web 2.0.0.RELEASE. To keep the code as similar as possible, I have used the annotation-based approach to create similar `controllers` and `services`. (For the curious, there is also a functional way to create routes and handlers.)
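To give a feel for the annotation-based approach on the reactive stack, here is a minimal sketch of what such a controller can look like. The class, endpoint, and `Person`/`PersonService` types are illustrative assumptions, not the repo's actual code.

```java
// Hypothetical sketch of an annotation-based WebFlux controller.
// Names (PersonController, /persons, PersonService) are assumptions.
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final PersonService personService;

    public PersonController(PersonService personService) {
        this.personService = personService;
    }

    // Returning Mono<Void> means the request thread is released while the
    // downstream registration call is in flight; WebFlux subscribes for us.
    @PostMapping
    @ResponseStatus(HttpStatus.CREATED)
    public Mono<Void> create(@RequestBody Person person) {
        return personService.register(person);
    }
}
```

The annotations mirror the familiar Spring MVC style, which is what makes the two codebases so comparable; only the return type changes from a plain value to a `Mono`.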
The real power of the reactive stack shows when interacting with blocking I/O. To simulate that, I have added a registrationService that introduces a delay of 200ms. Each PersonService calls the registrationService and returns status 201 Created if successful.
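The slow-dependency setup can be sketched as follows. This is an assumed shape, not the repo's exact code: the registrationService delays 200ms before responding, and the reactive PersonService calls it through the non-blocking `WebClient`.

```java
// Sketch (assumed, illustrative names) of the blocking-I/O simulation.
import java.time.Duration;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.reactive.function.BodyInserters;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@RestController
class RegistrationController {
    // Simulates a slow downstream dependency with a 200 ms delay.
    @PostMapping("/register")
    public Mono<Void> register() {
        return Mono.delay(Duration.ofMillis(200)).then();
    }
}

class RegistrationClient {
    private final WebClient webClient = WebClient.create("http://localhost:8081");

    // The calling thread is released while the 200 ms response is pending,
    // so it can serve other requests in the meantime.
    Mono<Void> register(Person person) {
        return webClient.post().uri("/register")
                .body(BodyInserters.fromObject(person))
                .retrieve()
                .bodyToMono(Void.class);
    }
}
```

On the servlet stack, the equivalent call holds a request thread hostage for the full 200ms, which is exactly where the difference under load comes from.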
To measure performance, I used Gatling to write my test scenario. The test consists of a user calling the PersonService 4 times with a Person payload (POST). We will vary the number of simultaneous users to see how the servlet and reactive stacks behave.
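A scenario of that shape can be expressed roughly as below. This is a sketch in Gatling's Java DSL; the repo itself likely uses the Scala DSL, and the endpoint, payload, and user count here are illustrative assumptions.

```java
// Hypothetical Gatling simulation: each user POSTs a Person 4 times.
import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

import io.gatling.javaapi.core.ScenarioBuilder;
import io.gatling.javaapi.core.Simulation;
import io.gatling.javaapi.http.HttpProtocolBuilder;

public class PersonSimulation extends Simulation {

    HttpProtocolBuilder httpProtocol = http.baseUrl("http://localhost:8080");

    // One virtual user = 4 sequential POSTs, each expected to return 201.
    ScenarioBuilder scn = scenario("Create persons")
            .repeat(4).on(
                exec(http("create person")
                        .post("/persons")
                        .body(StringBody("{\"name\":\"Jane\"}")).asJson()
                        .check(status().is(201))));

    {
        // Inject all users at once; vary this number per test run.
        setUp(scn.injectOpen(atOnceUsers(2500))).protocols(httpProtocol);
    }
}
```

Varying the argument to `atOnceUsers` is all that changes between the 2500-user and 10000-user runs.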
The project is available here https://github.com/raj-saxena/spring-boot-1-vs-2-performance
Here are my findings:
(I found that the performance of spring-boot web 1.5.10.RELEASE and spring-boot web 2.0.0.RELEASE was mostly similar, so I am only sharing the results of spring-boot web 2.0.0.RELEASE vs spring-boot reactive-web 2.0.0.RELEASE. I have added the reports of my runs to the git repo as well.)
Testing with 2500 simultaneous users
We can see that the response times are similar at this load, but the number of requests/sec handled by the reactive stack is more than 1.5x that of the servlet stack.
Testing with 10000 simultaneous users
By now, we can see that the reactive stack is the clear winner.
While the performance of the reactive stack has degraded slightly, the servlet stack has degraded tremendously: response times at the 50th and 75th percentiles dropped by roughly 8–10x, and across multiple runs a couple of requests failed completely each time on the servlet stack.
We can clearly see that the load the reactive stack can withstand without loss of performance is a significant improvement. It enables us to better utilise the hardware, especially for services that do a lot of I/O, like making network calls to other services or interacting with databases. It has similar throughput under low load and swiftly handles high load. This is a big win!
However, working with reactive streams involves a learning curve, and you have to ensure that no developer ends up calling `block()`. Another deal breaker might be the lack of reactive database drivers: JDBC is inherently blocking, and only a few databases, such as MongoDB and Cassandra, officially support reactive drivers at this point.
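The `block()` pitfall is worth illustrating with a small, hypothetical example (requires reactor-core; the names here are made up for illustration):

```java
// Hypothetical illustration of the block() pitfall with Project Reactor.
import java.time.Duration;
import reactor.core.publisher.Mono;

public class BlockingPitfall {

    // DON'T: block() parks the calling thread for the full 200 ms,
    // defeating the non-blocking model. On a reactor-netty event-loop
    // thread it even throws to prevent this.
    static String registerBlocking(Mono<String> result) {
        return result.block();
    }

    // DO: compose and return the Mono; the framework subscribes for you,
    // and the thread stays free to serve other requests in the meantime.
    static Mono<String> registerReactive(Mono<String> result) {
        return result.map(String::toUpperCase);
    }

    public static void main(String[] args) {
        Mono<String> result = Mono.just("registered")
                .delayElement(Duration.ofMillis(200));

        // This line waits the full 200 ms before printing.
        System.out.println(registerBlocking(result));
        // registerReactive(result) would instead be returned to the
        // framework, which subscribes without blocking any thread.
    }
}
```

A single stray `block()` in a hot path quietly turns the reactive service back into a thread-per-request one, which is why code review discipline matters here.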
Do let me know your thoughts.
Test machine:
Processor: 2.3 GHz Intel Core i5
Memory: 16 GB 2133 MHz LPDDR3
Graphics: Intel Iris Plus Graphics 640 1536 MB
I encourage you to try out the experiment yourself with different number of users and share the result.
A note: capturing performance is an elaborate and detailed field, with many parameters that can affect the results. I ran the tests multiple times before capturing results so that things like JVM warm-up and bytecode optimisation don't skew them. I have tried to minimise variables, simplify the implementation, and keep the testing conditions as even as possible. Even though I specified a large number of simultaneous users, the actual number of requests per second is limited by the capability of my hardware. Still, it serves to give us a sense of how the two stacks perform under stress.