Reactive Programming
A couple of months ago, I asked my line manager for some direction and accountability to get me looking into a topic I felt I didn’t know enough about: Reactive Programming.
He provided a long list of questions and some ideas for activities. For the past couple of weeks, I’ve been trying to hook up a full stack application to really show what I think RP can do, but I kept getting caught up on “oh but that’s using this technology instead of that”, and “oh but if you did this on the backend you could do this on the frontend”.
What has finally sunk in is that RP is a paradigm. It’s not just Spring WebFlux, or WebSockets, or RxJava. It’s an entirely different style of design and application.
With that in mind, I’m going to answer some of the questions that took a while to click for me.
First things first, what is Reactive Programming?
In Reactive Programming, everything is a stream.
Stock system getting a bunch of stock movements due to an influx of online orders? A stream of events.
Mouse clicks stopping then starting? A stream of user input.
Market opening and share prices changing? A stream of data.
The beauty of reactivity — push and pull.
To me, this is a foundational concept of Reactive Programming. Much of RP makes use of two long-standing programming patterns coming together.
The Observer Pattern: Observable A does something of interest, and sends that data to Subscriber B.
Therefore, A is interesting to B, and A controls when the next lot of data is pushed to B.
The Iterator Pattern: B handles the data that A has produced one item at a time. Once B has handled the data/event, it asks A for the next bit.
Therefore, B is interested in A, and B controls when the next lot of data is pulled from A.
Similar, right? If you join these two patterns together, you get a system where A can push information as it gets it, and B knows that it is there and can pull that information as it needs it. This is where streams come into it: the data needs to sit somewhere in the limbo between being pushed and pulled. In WebFlux, we can see this happening with Mono and Flux, but all Reactive Programming implementations have some method of handling asynchronous events and data streams.
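To make the push/pull handshake concrete, here is a minimal sketch using the JDK's built-in java.util.concurrent.Flow API rather than WebFlux. The names PushPullDemo, OneAtATime and run are made up for illustration. The publisher (A) pushes items with onNext, but only as fast as the subscriber (B) asks for them with request(1); anything submitted in between sits in the publisher's buffer.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

final class PushPullDemo {

    // Subscriber B: pulls by asking for one item at a time.
    static final class OneAtATime implements Flow.Subscriber<String> {
        final List<String> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);            // pull: "I can take one item"
        }
        @Override public void onNext(String item) {
            received.add(item);      // push: the publisher delivers it
            subscription.request(1); // pull the next one when ready
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    static List<String> run(List<String> events) {
        OneAtATime b = new OneAtATime();
        // Publisher A: buffers submitted items until the subscriber asks for them.
        try (SubmissionPublisher<String> a = new SubmissionPublisher<>()) {
            a.subscribe(b);
            events.forEach(a::submit);
        } // close() signals onComplete once the buffered items are delivered
        try {
            if (!b.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for completion");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return List.copyOf(b.received);
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("order-1", "order-2", "order-3")));
    }
}
```

Reactor's Mono and Flux build on the same Reactive Streams contract (onSubscribe, onNext, onError, onComplete, request), so the shape should look familiar even though the operators are much richer.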
What are Hot and Cold Observables?
An observable is a producer of data/events: it’s basically any asynchronous stream that can be subscribed to. A hot observable pushes data regardless of whether anything is pulling it, or intends to pull it. This is a bit like a broadcast, and can have many or no subscribers. The radio is playing music 24/7, but if you were to tune in right now, you wouldn’t get everything you’ve missed. And if you decide you don’t want to listen to Gold 104.3 anymore, you can switch it off but it’s still available; you’re just no longer listening.
On the other hand, a cold observable generates data only once there is a subscriber. If there is no subscription, the cold observable is not producing anything. Streaming a song on Spotify is an example of a cold observable: there is one stream per subscriber, and the stream is not playing if we haven’t opened up the app and played the song. Each subscriber can pause, skip or play, and it won’t affect anyone else listening to the song.
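The radio/Spotify difference can be sketched in a few lines of plain Java. This is a toy model with made-up names (HotColdDemo, Radio, coldSong), not how any reactive library implements it: the cold source is a recipe that starts from the beginning for each subscriber, while the hot source delivers only to whoever is listening at the moment of the broadcast.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class HotColdDemo {

    // Cold: nothing is produced until someone subscribes, and every
    // subscriber gets its own independent run from the start.
    static Supplier<Stream<String>> coldSong() {
        return () -> Stream.of("intro", "verse", "chorus");
    }

    // Hot: a broadcast that emits whether or not anyone is listening.
    static final class Radio {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void tuneIn(Consumer<String> listener) {
            listeners.add(listener);
        }

        void broadcast(String track) {
            // Delivered only to whoever is tuned in right now; never replayed.
            listeners.forEach(l -> l.accept(track));
        }
    }

    static List<String> lateListener() {
        Radio radio = new Radio();
        List<String> heard = new ArrayList<>();
        radio.broadcast("track-1");   // no listeners yet, so this one is missed
        radio.tuneIn(heard::add);
        radio.broadcast("track-2");
        radio.broadcast("track-3");
        return heard;
    }

    public static void main(String[] args) {
        Supplier<Stream<String>> song = coldSong();
        System.out.println(song.get().collect(Collectors.toList())); // [intro, verse, chorus]
        System.out.println(song.get().collect(Collectors.toList())); // [intro, verse, chorus]
        System.out.println(lateListener());                          // [track-2, track-3]
    }
}
```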
What kinds of systems can be implemented with reactive programming that aren’t possible in a traditional system that relies on a synchronous request/response lifecycle?
In a traditional synchronous web application, we wait for the user to do something with the UI, then the frontend will send a request to the backend. It gets, updates, deletes etc. something. It might hit the database, it might hit other services. It eventually comes back to the frontend with some sort of response.
What’s the issue here? For some applications, very little. Basic RESTful endpoints will work just fine.
Speed:
But what if the thing that I’m saving to the DB is completely separate from what I’m getting from the other server, and both of those actions take 5ms to complete? If we could do them at the same time, we’d get our response back quicker, right?
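As a rough sketch of that idea, the JDK's CompletableFuture can start both operations before waiting on either. The two methods saveToDatabase and fetchFromOtherService are hypothetical stand-ins that just sleep for 5ms; in a WebFlux service the equivalent would usually be two Monos combined with an operator such as zip.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class ParallelCallsDemo {

    // Hypothetical stand-ins for the two independent operations.
    static String saveToDatabase(String order) {
        pause(5);
        return "saved " + order;
    }

    static String fetchFromOtherService(String order) {
        pause(5);
        return "stock ok for " + order;
    }

    static String handle(String order) {
        // Start both operations without waiting for either one.
        CompletableFuture<String> saved =
                CompletableFuture.supplyAsync(() -> saveToDatabase(order));
        CompletableFuture<String> stock =
                CompletableFuture.supplyAsync(() -> fetchFromOtherService(order));
        // Combine the two results once both have arrived.
        // join() blocks the caller here only to keep the demo simple.
        return saved.thenCombine(stock, (a, b) -> a + "; " + b).join();
    }

    private static void pause(long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("order-42"));
    }
}
```

Because the two waits overlap, the total should be closer to the slower of the two calls than to their sum, assuming at least two threads are free to run them.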
Design:
What if I don’t want my client to have to send a request at all? What if the user is actually a group of 5 people working at a busy branch, and they are just glancing up occasionally at a screen to check if there are new orders, or late deliveries, or a customer approaching the carpark? In that case, when something changes in another service, or in the data my backend is streaming, I want that information to essentially be pushed to my client without the frontend explicitly needing to know that there is new information for it to consume.
Efficiency:
What if we have thousands of clients making requests simultaneously, and our backend server only has 200 threads available to it? If we get 1000 requests in 1 second, and each one needs to wait 5ms for the other server and then 5ms for the database, each thread spends 50ms of that second doing nothing but waiting (1000 requests / 200 threads * (5+5)ms dead time = 50ms dead time per thread). That sounds harmless, but downstream calls are rarely that quick. If each call takes 50ms instead of 5ms and we get 2000 requests in a second, every thread spends the full second essentially waiting for things to finish (2000 requests / 200 threads * (50+50)ms = 1000ms). If we get 2000 requests every second for an hour, we’re screwed: with no thread time left over for actual work, there will be a huge backlog of requests waiting for a thread to be available to start the waiting game all over again. Whilst a thread is waiting for the database to return a value, couldn’t that thread go do something else for a bit?
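Here is a small sketch of what "go do something else for a bit" can look like, using only the JDK. The names NonBlockingWaitDemo, callDownstream and handleAll are invented for the example, and the downstream call is simulated with a timer via CompletableFuture.delayedExecutor (Java 9+). The point is that a worker thread is only needed once the response is ready, so a tiny pool can have many requests in flight at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class NonBlockingWaitDemo {

    // Simulates a downstream call that answers after delayMs. The wait is
    // handled by a timer, so no worker thread is parked while it elapses.
    static CompletableFuture<String> callDownstream(int id, long delayMs, Executor workers) {
        Executor afterDelay =
                CompletableFuture.delayedExecutor(delayMs, TimeUnit.MILLISECONDS, workers);
        return CompletableFuture.supplyAsync(() -> "response-" + id, afterDelay);
    }

    static List<String> handleAll(int requests, int workerThreads, long delayMs) {
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);
        try {
            // Fire off every request up front; none of them holds a worker yet.
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int id = 0; id < requests; id++) {
                pending.add(callDownstream(id, delayMs, workers));
            }
            // Collect the results in request order (blocking only in this demo's caller).
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> response : pending) {
                results.add(response.join());
            }
            return results;
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        // 1000 requests, 2 worker threads, 10ms simulated downstream latency.
        System.out.println(handleAll(1000, 2, 10).size() + " responses handled");
    }
}
```

With blocking calls, two threads would need roughly 1000 / 2 * 10ms of wall-clock time for the same load. Here the waits overlap, so the workers only spend time on the brief moment of producing each response.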
There are other ways that RP can improve traditional architecture, but these were the three improvements I was looking for RP to provide. As RP is a paradigm, it is the implementation in a tool, language or framework that actually determines the improvement we see in these areas.
Reactive Design vs Reactive Programming
Let’s take the Design example, where we want information to be delivered directly to the frontend. How do we make this reactive? The system described is reactive no matter how it is implemented, but there are certainly better and worse ways to do it. At the resource-hungry end, your client could poll the backend constantly with HTTP GET requests for the latest data. From a user experience point of view, this is a reactive system, because the user doesn’t need to do anything for the screen to update. But it’s not Reactive Programming. RP requires some sort of streaming mechanism, and if it’s implemented right, your speed and efficiency should be improving, not bottlenecking.
Two viable solutions for this could be long-polling or WebSockets. Long-polling is where the client sends a request for new information, but the server holds off responding until there actually is new information, so you’re not wasting bandwidth sending empty requests and responses back and forth. WebSockets seem very cool, and I wanted to play around with them, but they do require a pretty tightly coupled relationship between frontend and backend. Once the initial HTTP handshake that sets up the connection is done, you have a fully bi-directional flow of information between server and client!
RP and a reactively designed system are different things. If we have a Reactive Design, we can probably find ways to avoid RP and deliver the application, but it’s not going to be as efficient as we want it to be. To get the best out of RD, we should be utilising RP.
Okay okay, Reactive Design != Reactive Programming. Can we do Reactive Programming without Reactive Design?
Excellent question. In fact, it’s the question that I was trying to ask, and the question my manager was trying to answer, before I had the concept of RD != RP down pat.
At my work, we have a whole bunch of Spring WebFlux scattered throughout our microservices. What does this mean? WebFlux uses Project Reactor, a non-blocking reactive library for JVM projects. We use WebClient instead of RestTemplate when invoking HTTP requests, and reactive return types when responding to them. The code looks a bit different to a classic Spring Boot project: you’ll see lots of Mono and Flux, and as it’s a functional model, lots of inline transformations of the data you’re handling.
Our database drivers are not reactive, and we don’t utilise the non-blocking abilities of our backend when we call APIs from the frontend.
Is it at all worth us dealing with the more complicated code when our whole system is not reactive? Yes. I think it is. At least, it’s worth it enough to make me want to leave the working code as is. Using WebFlux means that we have better thread utilisation, and in most of the services where we are implementing this, we are either dealing with a gateway that has huge amounts of network traffic, or we are pulling together information from lots of different services that may take time to reply.
Do we have any fully reactive systems in place?
We make excellent use of event-driven processing throughout our services, which is both RD and RP.
If a traditional API is invoked from a reactive service, do we get any benefits?
Yes. When I initially started looking to understand RP, I was very confused by this. In my head, a reactive invocation would have a return type that looked like this:
However, in reality, the traditional API is only going to return one response, when it’s ready, so it looks like this:
How is this still reactive? Even though we are only waiting on one response, WebClient is still treating it like a stream. A stream can have no value, some values, and it can usually tell you if that value is an error and if that value is the last value you’ll be getting. In the case of WebFlux, when we are waiting on just one value, we are handling a Mono. WebClient engages a thread to fire off the request, then allows that thread to go be used elsewhere. Once the Mono has a value ready to be read, the system is notified and onNext or onError is invoked, and any available thread can come and pick up where the other one left off.
Horizontal vs vertical scaling
A typical servlet-based Spring Boot instance has about 200 request threads available by default (Tomcat’s default maximum). If that’s not enough, you have two options for scaling. You can scale an application horizontally by making more instances available, e.g. having multiple pods running in Kubernetes. Or you can scale vertically, where you get more out of an existing instance, either by giving it more resources or by making better use of the ones it already has. Using WebFlux in our services gives us vertical scalability in that second sense.
What can my team look to improve in the future?
We have microservices that have been implemented using Reactive Programming, which is great, but are we getting the most out of them? We could definitely design our systems in a more reactive way. Things like customer credit limits and stock availability — which do change on the fly — could absolutely be looked into with a reactive lens. We could keep our frontend listening and able to react to updated information coming in from the backend.
By implementing a reactive DB driver, we would get improvements in our application speed and efficiency, and we would open doors to switching to an even more Reactive system. There is a reactive database connectivity specification supported by Spring, R2DBC (Reactive Relational Database Connectivity), with drivers available for MySQL. However, R2DBC doesn’t yet coexist with Hibernate, and we already make use of event-driven systems throughout our architecture to asynchronously handle the balance of business logic vs waiting to update data in a table. As a future task, I’d be curious to update our database driver in one of our Kotlin services and see if we get any noticeable performance improvements showing up in our automated performance testing.
Is WebFlux the best way to implement RP in a backend service?
Once upon a time, many Java developers relied on RxJava for all of their RP needs. WebFlux has been great for a time, but it’s still so finicky to implement. Complicated code is still complicated code, even if it’s doing lots of under-the-hood vertical scaling.
Cue the wonderful modern alternative: Kotlin. You can still run WebFlux in a Kotlin Spring Boot application, but you don’t need to write Mono and Flux chains by hand. The RP is done with native Kotlin coroutines and Flow, which is part of the kotlinx.coroutines library rather than the core language. I won’t go into all of the details here, but you can read more about all the neat things Kotlin does, including automatically handling backpressure, here.
Playing around with Reactive Programming.
Honestly, there are so many ways you can show off RP, because there are so many different cool ways to do it. I started and got working several simple applications, but there were just so many distractions and rabbit holes to go down… WebSockets, Kafka, Kotlin Flow, R2DBC, STOMP, server-sent events: so many new and exciting things to look into. In hindsight, if you want to dive deep into a few technologies and learn about RP, it would be useful to commit to a project that has a clear reactive design, such as a chatroom application. Personally, my questions around RP stemmed from the fact that a lot of my exposure has been with a system where we are doing RP without RD. It was beneficial to explore a range of different technologies in order to get my head around the fact that RP is more of a paradigm than a specific list of technologies, and it can be harnessed to improve both system design and system efficiency.