Reactive Programming in Java
Reactive programming is a relatively recent paradigm that has quickly become a popular topic among software engineers. Before we dive into the topic of the article, let’s first look at the reason behind this popularity and at how things worked before reactive programming was introduced.
Let's look at a traditional web application backend that was developed without using reactive programming.
As shown in the image above, the backend is hosted on a web server. When a request comes in, the web server assigns it to a thread, and that thread stays occupied with the request until processing finishes. The processing may involve calling a database or a third-party API, which takes time to complete.
If another request comes in, that request will be assigned to another available thread and that thread will also be busy until the process is completed.
Remember that every web server has a maximum thread count, and the limit depends on the server and its configuration. If we use Spring Boot with its default embedded web server, Tomcat, the limit is 200 worker threads by default.
Drawbacks in the traditional way
- Once a thread is assigned to a request, that thread is not available to other requests until the request finishes.
- If all the threads are occupied, new requests have to wait until at least one thread frees up.
- When all the threads are busy, performance degrades because every one of those threads is holding memory.
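To make these drawbacks concrete, here is a minimal sketch that simulates the thread-per-request model with a small fixed thread pool. It is not how Tomcat is implemented; the class name ThreadPerRequestDemo, the helper handleAll, and the sleep that stands in for a database or API call are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ThreadPerRequestDemo {

    // Simulates a blocking call (database query, third-party API) by sleeping.
    static void blockingCall(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs `requests` blocking tasks on a pool of `maxThreads` threads and
    // returns the wall-clock time in milliseconds until all of them finish.
    static long handleAll(int maxThreads, int requests, long callMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> blockingCall(callMillis)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every simulated request to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // 2 threads, 4 requests of 100 ms each: the last two wait for a free thread.
        System.out.println("elapsed ms: " + handleAll(2, 4, 100));
    }
}
```

With 2 threads and 4 requests, the total time is roughly twice the duration of a single call, because the last two requests sit in the queue until a thread frees up.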
As a solution to the above drawbacks, a team of developers led by Jonas Bonér came together and introduced a new programming paradigm. Well, actually, it was a set of core principles.
It’s important to understand the meaning of reactive in order to understand the reactive programming paradigm. To react is to respond, but what do we react to? We react to events, so being reactive means responding to events. In this way, we can define reactive programming as an event-driven method of programming.
Reactive programming is a programming paradigm where the focus is on developing asynchronous and non-blocking applications in an event-driven form.
- Asynchronous and non-blocking
Asynchronous execution means that code does not have to run strictly top to bottom, waiting for each step to finish. In synchronous code, a blocking call, such as a database query or a third-party API call, stops the execution flow until it returns. In non-blocking, asynchronous code, the execution flow is not blocked; instead, futures and callbacks deliver the result when it is ready.
- Event/Message Driven stream data flow
In reactive programming, data flows as a stream, and because it is reactive, each piece of data arrives as an event that the consumer responds to. In Java, the style is similar to Java Streams, which were introduced in Java 8. In the traditional way, when we get data from a data source (e.g. a database or an API), all the data is fetched at once. In an event-driven stream, the data is delivered item by item, and each item reaches the consumer as an event.
- Functional style code
In Java, we write lambda expressions for functional programming. Lambda expressions are functional-style code, and reactive programming mostly uses this style.
- Back Pressure
In reactive streams, when a reactive application (the consumer) consumes data from a producer, the producer publishes data to the application continuously as a stream. Sometimes the application cannot process the data at the speed of the producer. In this case, the consumer can signal the producer to slow down its publishing.
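The asynchronous, callback-based, functional style described above can be sketched with the JDK's own CompletableFuture. This is only an illustration of the style, not a reactive library; fetchUserName and greeting are hypothetical names, and the 50 ms delay stands in for a slow database or API call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class AsyncCallbackDemo {

    // Hypothetical slow lookup, standing in for a database or API call.
    // The value is produced about 50 ms later on another thread.
    static CompletableFuture<String> fetchUserName(int id) {
        return CompletableFuture.supplyAsync(
                () -> "user-" + id,
                CompletableFuture.delayedExecutor(50, TimeUnit.MILLISECONDS));
    }

    // Composes callbacks with lambdas; nothing here blocks the calling thread.
    static CompletableFuture<String> greeting(int id) {
        return fetchUserName(id)
                .thenApply(String::toUpperCase)        // runs when the value arrives
                .thenApply(name -> "Hello, " + name)
                .exceptionally(err -> "Hello, guest"); // fallback if any step failed
    }

    public static void main(String[] args) {
        CompletableFuture<String> result = greeting(7);
        System.out.println("request submitted, caller is free to do other work");
        // join() blocks here only so the demo prints before the JVM exits.
        System.out.println(result.join()); // prints "Hello, USER-7"
    }
}
```

The caller gets a future back immediately and describes what should happen to the value once it exists, instead of waiting for it.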
Reactive Streams Specification
The Reactive Streams Specification is a set of rules to follow when designing a reactive stream. Software engineers from well-known companies such as Netflix and Twitter got together and introduced this specification.
The specification introduces four interfaces that should be implemented when creating a reactive stream.
Publisher is a single-method interface used to register a subscriber with the publisher. Its subscribe method accepts the subscriber object and registers it.
Subscriber is an interface that has four methods:
The onSubscribe method is called by the publisher when the subscriber is registered; it receives the Subscription object.
The onNext method is called each time the next item is published to the subscriber.
The onError method is called when an exception arises while publishing data to the subscriber.
The onComplete method is called after data publishing to the subscriber completes successfully.
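The order of these four signals can be observed with a small sketch. It uses java.util.concurrent.Flow (JDK 9+), whose nested interfaces mirror the Reactive Streams ones, and the JDK's SubmissionPublisher as a ready-made publisher. The class name CollectingSubscriber and its run helper are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class CollectingSubscriber implements Flow.Subscriber<String> {
    final List<String> events = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        events.add("onSubscribe");
        subscription.request(Long.MAX_VALUE); // ask for everything at once
    }

    @Override
    public void onNext(String item) {
        events.add("onNext:" + item);
    }

    @Override
    public void onError(Throwable error) {
        events.add("onError:" + error.getMessage());
        done.countDown();
    }

    @Override
    public void onComplete() {
        events.add("onComplete");
        done.countDown();
    }

    // Publishes the items to a fresh subscriber and returns the signals it saw.
    static List<String> run(String... items) {
        CollectingSubscriber subscriber = new CollectingSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber); // Publisher.subscribe registers the subscriber
            for (String item : items) {
                publisher.submit(item);
            }
        } // close() signals onComplete after the buffered items are delivered
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for completion");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.copyOf(subscriber.events);
    }

    public static void main(String[] args) {
        System.out.println(run("a", "b"));
        // prints [onSubscribe, onNext:a, onNext:b, onComplete]
    }
}
```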
Subscription is an interface with two methods. The subscription object is created when a subscriber subscribes to the publisher, as discussed earlier, and it is passed to the subscriber via the onSubscribe method.
The request method is called when the subscriber needs to request data from the publisher.
The cancel method is called when the subscriber needs to cancel and close the subscription.
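These two methods are also how backpressure works in practice: the subscriber decides how many items it is ready for. The following sketch, again based on the JDK's Flow interfaces and SubmissionPublisher rather than on any particular reactive library, requests one item at a time and cancels once it has enough. OneAtATimeSubscriber and takeFirst are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
    private final int limit; // assumed to be at least 1
    private Flow.Subscription subscription;
    final List<Integer> received = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);

    OneAtATimeSubscriber(int limit) {
        this.limit = limit;
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        subscription.request(1); // demand exactly one item to start
    }

    @Override
    public void onNext(Integer item) {
        if (received.size() >= limit) {
            return; // defensive: ignore anything that arrives after cancel
        }
        received.add(item);
        if (received.size() >= limit) {
            subscription.cancel(); // enough data, close the subscription
            done.countDown();
        } else {
            subscription.request(1); // ready for the next item
        }
    }

    @Override
    public void onError(Throwable error) {
        done.countDown();
    }

    @Override
    public void onComplete() {
        done.countDown();
    }

    // Publishes 1..published and returns what a subscriber limited to `limit` items saw.
    static List<Integer> takeFirst(int limit, int published) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber(limit);
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= published; i++) {
                publisher.submit(i);
            }
        }
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for the subscriber");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.copyOf(subscriber.received);
    }

    public static void main(String[] args) {
        System.out.println(takeFirst(3, 10)); // prints [1, 2, 3]
    }
}
```

Requesting one item per onNext call is the most conservative form of demand; a real subscriber would typically request in larger batches.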
Processor is an interface that extends both the Subscriber and Publisher interfaces. It is used less often, but it can hold processing logic that sits between the publishing and subscribing steps of a workflow.
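As a rough sketch of where a processor sits, the example below places a mapping step between a source publisher and a final subscriber. It follows the pattern shown in the SubmissionPublisher Javadoc and uses the JDK's Flow.Processor rather than a reactive library; MapProcessorDemo, MapProcessor, and run are names invented for this illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class MapProcessorDemo {

    // Subscriber towards the source, Publisher (via SubmissionPublisher) towards downstream.
    static class MapProcessor<T, R> extends SubmissionPublisher<R>
            implements Flow.Processor<T, R> {
        private final Function<T, R> mapper;
        private Flow.Subscription upstream;

        MapProcessor(Function<T, R> mapper) {
            this.mapper = mapper;
        }

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            upstream = subscription;
            subscription.request(1);
        }

        @Override
        public void onNext(T item) {
            try {
                submit(mapper.apply(item)); // transform, then republish downstream
                upstream.request(1);
            } catch (RuntimeException e) {
                upstream.cancel();          // stop the source
                closeExceptionally(e);      // downstream receives onError
            }
        }

        @Override
        public void onError(Throwable error) {
            closeExceptionally(error);
        }

        @Override
        public void onComplete() {
            close(); // downstream receives onComplete
        }
    }

    // Pushes the items through the mapper and returns the signals seen downstream.
    static List<String> run(Function<Integer, Integer> mapper, Integer... items) {
        List<String> signals = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        MapProcessor<Integer, Integer> processor = new MapProcessor<>(mapper);
        processor.subscribe(new Flow.Subscriber<Integer>() {
            @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
            @Override public void onNext(Integer i) { signals.add("onNext:" + i); }
            @Override public void onError(Throwable t) {
                signals.add("onError:" + t.getMessage());
                done.countDown();
            }
            @Override public void onComplete() {
                signals.add("onComplete");
                done.countDown();
            }
        });
        try (SubmissionPublisher<Integer> source = new SubmissionPublisher<>()) {
            source.subscribe(processor);
            for (Integer i : items) {
                source.submit(i);
            }
        }
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for a terminal signal");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.copyOf(signals);
    }

    public static void main(String[] args) {
        System.out.println(run(i -> i * 10, 1, 2, 3));
        // prints [onNext:10, onNext:20, onNext:30, onComplete]
    }
}
```

If the mapper throws for some item, the downstream subscriber ends with an onError signal instead of onComplete; items still buffered at that moment may not be delivered.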
A reactive library is nothing but the implementation of reactive specification interfaces which we discussed above. Here are some reactive libraries that are available to us:
- Project Reactor
- Flow class in JDK 9
Reactive streams in Spring Boot are built on Project Reactor.
Because this article is about reactive programming in Java, I will briefly discuss the main concepts of Project Reactor.
Project Reactor is one of the most popular reactive libraries in Java. It is a fully non-blocking reactive streams implementation with backpressure support, and it integrates directly with the Java 8 functional APIs. Its main artifact is reactor-core, which has a few concepts worth discussing.
Flux represents an asynchronous sequence of 0 to N items. We can apply various transformations to this stream, including transforming it into a stream of an entirely different item type.
As shown in the image, the numbers on the top are the N items that were published to the subscriber. This is the Flux item stream, and the line at its end represents the completion of the stream. The operator box represents the transform operation on the stream.
The transformation is applied to the items of the Flux stream one by one, and each item is delivered through the onNext() method that we discussed earlier. The bottom set of numbers represents the Flux stream emitted after the transformation is applied.
The red X icon represents an error that happened while applying the transformation to a particular item in the Flux stream; it is signaled through the onError() method that we discussed earlier.
If no error occurs, the transformation is applied to all the elements, and the stream then ends with the onComplete() method that we discussed earlier.
Mono represents a stream of at most one item (0 or 1). We can apply various transformations to it, including transforming it into an entirely different type.
As shown in the image, only one item is published to the subscriber. The line at the end represents the completion of the stream. The operator box represents the transform operation for this Mono stream.
The transformation is applied to the item of the Mono stream, which is delivered through the onNext() method that we discussed earlier.
The red X icon represents an error that happened while applying the transformation to the item; it is signaled through the onError() method that we discussed earlier.
If no error happens, the transformation is applied to the item successfully, and the stream then ends with the onComplete() method.
The best way to examine the performance is to compare the reactive and non-reactive approaches. The experiment is as follows.
I created two Spring Boot services with one endpoint each. One uses reactive Spring WebFlux and the other uses regular, non-reactive Spring Boot. I also created a Node.js application to call those service endpoints asynchronously.
In the first experiment, I ran the reactive service and called its endpoint with 10, 100, 1000, 5000, and 10000 requests, recording the average time to complete each request and the time to complete all requests. Then I did the same for the non-reactive service and created some graphs based on this data.
Average time to complete each request graphs
(The non-reactive service could not handle 10000 requests on my machine due to limited resources, so I treat that result as infinity (INF).)
For both the reactive and non-reactive services, requests take about the same time to complete at a low number of concurrent requests. However, there is a large difference at a higher number of concurrent requests. From this, we can conclude that the reactive approach is useful for applications that will be used by a large number of users at the same time.
Time to complete all the requests graphs
Once again, both the reactive and non-reactive services take about the same time at a low number of concurrent requests, and once again, there is a large difference at a higher number of concurrent requests.
The conclusion is the same: the reactive approach is useful for applications that will be used by a large number of users at the same time. It saves memory by completing all the processing in less time, so incoming requests can be served without delays or failures.
In the second experiment, I tested both services with 1000 concurrent requests, but this time with different delays. I recorded the time each service took to complete all the requests, as well as the average time to complete one request. The delays were 1, 5, 10, 15, and 20 seconds.
Average time to complete each request graphs
Here we can observe that there is quite a big difference between the reactive and non-reactive approaches in the average time to complete one request. The reactive approach takes much less time per request, which may lead us to conclude that it is the better approach for applications that must handle heavy, time-consuming tasks or slow API calls.
Time to complete all the requests graphs
Here we can observe that there is a large difference between the reactive and non-reactive approaches in the time to complete heavy, time-consuming tasks. The reactive approach takes much less time to complete the bulk of these tasks. It saves memory by completing all the processing in less time, so incoming requests can be served without delays or failures.
When should you use Reactive Programming?
- When you build a web app that will be used frequently, by many users at the same time. If we use the traditional way, all the threads of the web server will be busy and your app will become unresponsive.
- When you are building a data streaming application. With this type of application, data flows in continuously, and the reactive approach handles that better.
- When you are building big data or microservices applications. In big data applications, large volumes of data flow through the app, and with microservices, communication between services often happens through streams.