Concurrent Executable Diagrams
Recently I’ve been working on building a dependency graph of Maven modules by scanning a large number (thousands) of repositories over a REST API, reading pom.xml files, and resolving cross-references. I also needed to pull information from some other systems.
It all worked well implemented in pure Java and executing in a single thread. However, it wasn’t “that fast”. Now I need to scale to tens of thousands of repositories, and “not that fast” would become “very slow”.
So, I need to implement concurrent loading. Not a big deal! However, it will make my code more complex and less flexible, and it will force me to add more configuration parameters related to concurrent execution.
I already had diagrams depicting the processing logic — I used them to explain the loading logic to myself (“How do I know what I think until I see what I say?”) and my teammates.
With the capability to make diagrams executable already available, all I needed was to extend that capability to concurrent execution. This story is an overview of several ways to implement concurrency in diagram execution, all with demos (Concurrent Executable Diagrams); a link to the sources is in the footer.
A quick note about the color theme: it is Minty. I personally wouldn’t use it for my websites, but I’m going through all Bootswatch themes in my demos and now it is Minty’s turn, so it is what it is!
AsyncInvocableEndpointFactory
The first option to implement concurrent execution is to use AsyncInvocableEndpointFactory.
A brief intro into handlers, endpoints, and endpoint factories.
A handler is something that a graph element processor provides so that other processors can interact with it. A real-world analogy is a human ear, as shown in the diagram below.
An endpoint is something that is provided to a processor to interact with other processors’ handlers. It can be the handler itself or another object. In the diagram above, if two people are close enough, the person on the left talks to the person on the right using that person’s ear (handler) as an endpoint.
However, if communication is to be done over a considerable distance in space or over a distance in time, then a different endpoint is required. In the above diagram it is a microphone. For a distance in time there would also be a recording device.
So, an endpoint factory is responsible for creating endpoint objects for handler objects. AsyncInvocableEndpointFactory creates endpoints of type Invocable for handlers of type AsyncInvocable. Calls to the endpoint’s invoke() are executed in separate threads, and results are delivered via a CompletableFuture.
A method can be wrapped into an AsyncInvocable using the wrap attribute of the IncomingHandler annotation:
@IncomingHandler(wrap = HandlerWrapper.ASYNC_INVOCABLE)
public Message chat(Message request) throws InterruptedException {
    ...
}
In the diagram below, Alice communicates with Bob asynchronously, while Bob communicates with Carol synchronously:
You can find more details on the demo web page and in the source files.
This approach can be extended to execution on different machines, or at a much later time when the current JVM is no longer around.
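To illustrate the mechanics, here is a minimal sketch in plain JDK concurrency. The Invocable and AsyncInvocable interfaces below are simplified, hypothetical stand-ins for the framework’s actual types (the real signatures differ); only ExecutorService and CompletableFuture are standard JDK classes:

```java
import java.util.concurrent.*;

public class AsyncEndpointSketch {

    // Hypothetical handler shape: a synchronous invocation (the "ear").
    interface Invocable {
        Object invoke(Object... args);
    }

    // Hypothetical async endpoint shape: same call, but the result
    // is delivered by a CompletableFuture computed on another thread.
    interface AsyncInvocable {
        CompletableFuture<Object> invoke(Object... args);
    }

    // The endpoint-factory role: wrap a handler so that calls to
    // invoke() are shifted to the supplied executor.
    static AsyncInvocable asyncEndpoint(Invocable handler, Executor executor) {
        return args -> CompletableFuture.supplyAsync(() -> handler.invoke(args), executor);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Invocable handler = a -> "Echo: " + a[0];
        AsyncInvocable endpoint = asyncEndpoint(handler, pool);

        // The caller's thread is not blocked until get() is called.
        CompletableFuture<Object> reply = endpoint.invoke("Hello");
        System.out.println(reply.get()); // prints "Echo: Hello"
        pool.shutdown();
    }
}
```

The caller gets the future back immediately and can keep working, which is exactly what makes the Alice-to-Bob interaction asynchronous.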
AsyncInvocableConnectionProcessor
With this approach there is no endpoint factory; concurrent execution is implemented by a connection processor on the connection from Alice to Bob. The thread pool size is passed to the processor as a URI fragment, which makes it possible to have multiple thread pools for different tasks.
The above diagram uses the PCB metaphor from the Executable (computational) graphs & diagrams story.
Similar to how capacitors shift phase in electrical circuits, AsyncInvocableConnectionProcessor “shifts” execution of the handler from the thread which calls the endpoint to another thread.
Using the PCB metaphor, we can say that the endpoint factory solders processor pins to the wires. A pin acts as a handler if it receives a signal from another pin, or as an endpoint if it sends a signal. The soldering joint may transform handlers into endpoints. Another way to think about it: a no-op endpoint factory solders components directly to the PCB, while AsyncInvocableEndpointFactory solders in sockets with built-in capacitors and then plugs processors into the sockets instead of soldering them directly to the PCB.
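As a rough illustration of the URI-fragment configuration, here is a sketch of how a connection processor might size its pool. The connection URI format (async://alice-bob#4) is a hypothetical example, not the framework’s actual scheme:

```java
import java.net.URI;
import java.util.concurrent.*;

public class ConnectionProcessorSketch {

    // Read the thread pool size from the connection URI fragment,
    // e.g. "async://alice-bob#4" -> 4 threads; fall back to a default.
    static int poolSize(URI uri, int defaultSize) {
        String fragment = uri.getFragment();
        return fragment == null || fragment.isEmpty()
                ? defaultSize
                : Integer.parseInt(fragment);
    }

    public static void main(String[] args) throws Exception {
        URI connection = URI.create("async://alice-bob#4"); // hypothetical connection URI
        ExecutorService pool = Executors.newFixedThreadPool(poolSize(connection, 1));

        // The "capacitor": the caller submits on its own thread,
        // the handler runs on a pool thread.
        Future<String> reply = pool.submit(() -> "Handled on " + Thread.currentThread().getName());
        System.out.println(reply.get());
        pool.shutdown();
    }
}
```

Since each connection carries its own fragment, two connections can point at pools of different sizes, which is what allows separate pools for separate tasks.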
Thread pool container
With the thread pool container approach, concurrent execution is still implemented by the Alice -> Bob connection processor. However, the processor does not manage a thread pool of its own; it uses the Thread Pool container processor’s pool, which gets injected into the connection processor via the @ParentProcessor annotation.
The container approach can be used not only with thread pools, but with other resources as well: for example, a GitLabApiProvider for working with GitLab, or a database connection. Containers can be nested, e.g. a thread pool container may be nested inside a GitLabApi container. Inside those containers, connection processors would take care of concurrent execution and node processors would use a GitLabApi retrieved from the GitLabApiProvider container.
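The container idea can be sketched as follows. Constructor injection here stands in for the framework’s @ParentProcessor injection, and both class names are hypothetical:

```java
import java.util.concurrent.*;

public class ContainerSketch {

    // Hypothetical container processor owning a shared resource.
    static class ThreadPoolContainer implements AutoCloseable {
        final ExecutorService pool = Executors.newFixedThreadPool(4);
        public void close() { pool.shutdown(); }
    }

    // Hypothetical connection processor; in the framework the parent
    // would be injected via @ParentProcessor, here it is passed
    // through the constructor.
    static class AsyncConnection {
        final ThreadPoolContainer parent;
        AsyncConnection(ThreadPoolContainer parent) { this.parent = parent; }

        CompletableFuture<String> send(String message) {
            // Uses the container's pool instead of managing its own.
            return CompletableFuture.supplyAsync(() -> "Bob got: " + message, parent.pool);
        }
    }

    public static void main(String[] args) throws Exception {
        try (ThreadPoolContainer container = new ThreadPoolContainer()) {
            AsyncConnection aliceToBob = new AsyncConnection(container);
            AsyncConnection aliceToCarol = new AsyncConnection(container); // shares the pool
            System.out.println(aliceToBob.send("Hi").get());      // prints "Bob got: Hi"
            System.out.println(aliceToCarol.send("Hello").get()); // prints "Bob got: Hello"
        }
    }
}
```

The point of the pattern is that many connection processors share one pool whose lifecycle is owned by the container, rather than each connection spinning up and shutting down its own executor.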
Conclusion
I’ve explained the mechanics and benefits of creating software solutions as executable diagrams in my “General purpose executable graphs and diagrams” story referenced above. The ability to execute those diagrams in multiple threads is one more reason to start using this approach.