Ballerina: Concurrency Done Right!
Concurrent programming is not the easiest thing to do, and its concepts are not the simplest to grasp, whether you are a newcomer or an experienced programmer trying to apply them effectively to real-world problems.
In the Ballerina programming language, we have made writing concurrent programs as natural as possible. There are no special libraries you need to import, nor any complicated concepts you need to master, to be productive. Ballerina's roots lie in the most natural approach people use to model problems: sequence diagrams. Whenever we need to explain a problem and its solution to someone, we can easily draw a sequence diagram showing all the possible interactions and execution paths of the program. Sequence diagrams inherently support a concurrency model through their lifelines.
The above image is an actual editable view of a program in Ballerina Composer. This specific program contains two parallel workers that do their own independent processing and use message passing to coordinate their behavior with each other. The best part of Ballerina is that all the language constructs are designed to be compatible with graphical modeling as well, so editing your program in either the graphical or the text mode becomes a seamless experience. This functionality especially shines when it comes to the concurrency constructs the language supports.
The main building block of Ballerina execution is the worker. A worker is simply a concurrent execution flow defined in any callable unit, which can be a function, an action, or a resource. Let's generalize all of these as functions. A function can either have a single function body, which is a list of statements, or it can explicitly define one or more workers, as shown below.
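The listing referenced here is not included in this text, so the following is a minimal sketch of a function with two explicit workers, assuming the worker syntax of the Ballerina releases this article appears to target (the worker names and printed strings are illustrative):

```ballerina
import ballerina/io;

function main(string... args) {
    // two explicit workers; both start executing concurrently when main is invoked
    worker w1 {
        io:println("Hello from worker w1");
    }
    worker w2 {
        io:println("Hello from worker w2");
    }
}
```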
The above code defines a main function with two workers. These workers are executed concurrently when the function is invoked. If you don't define an explicit worker using a "worker" block, the function body is used to create a single implicit worker for the function. Message passing between workers is done using worker send and receive instructions, which take the form "variable -> worker_name" and "variable <- worker_name" respectively. Listing 02 below shows the textual representation of the program defined in Figure 01.
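Listing 02 is not reproduced in this text; the following is a hedged reconstruction of a two-worker program with message passing, using the send/receive form described above (the variable names and the value 100 are illustrative):

```ballerina
import ballerina/io;

function main(string... args) {
    worker w1 {
        int i = 100;
        // send the value of i to worker w2
        i -> w2;
    }
    worker w2 {
        int j = 0;
        // receive a value from worker w1 into j
        j <- w1;
        io:println(j);
    }
}
```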
Message passing between workers is constrained so that every "worker send" must have a corresponding "worker receive" in the receiving worker, in the same order. This ensures there is always a matching send/receive pair and avoids the possibility of deadlocks.
You may be wondering: if we have multiple workers, can we return from multiple workers as well, and when is a function invocation actually finished if there are multiple workers and only one returns? The answer is that multiple workers can contain return statements, but at runtime only one worker can actually execute one. The moment a return is executed in a worker, the function returns to the caller with that result. The remaining workers can still continue executing to completion, but they cannot execute a return statement again; if one does, the runtime logs an error message. In the case of a void function, by default the function waits for all the workers to complete before returning to the caller, or a worker can explicitly execute an empty "return" statement to return to the caller immediately.
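As a sketch of this rule, consider two workers that both contain a return statement; whichever executes first determines the function's result. The syntax is assumed from the Ballerina releases of this era, and `runtime:sleep` is used only to make w2 finish first:

```ballerina
import ballerina/runtime;

function fastest() returns string {
    worker w1 {
        // simulate a slow computation
        runtime:sleep(1000);
        return "w1 result"; // the function has already returned by now, so the
                            // runtime logs an error for this second return
    }
    worker w2 {
        return "w2 result"; // executes first, so the caller receives this value
    }
}
```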
Fork/join is the construct to use when you want to split the current execution in the middle of a function so that parts of it run concurrently. It forks the execution into multiple workers, which execute independently, and then, according to a specified join rule, waits for the workers to finish and sends their results to a join block. The join rules include waiting for all workers to complete, waiting for a specific number of workers, waiting for a specific set of named workers, or waiting until a user-specified timeout.
An example of a fork/join can be found below:
Listing 03 shows an order processing scenario, where the inventory check and the credit check sub-processes are done in parallel, and the results are later joined to derive the final result.
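Listing 03 itself is missing from this text; a rough reconstruction of the scenario, assuming the fork/join syntax of the Ballerina releases this article appears to target (the boolean results are placeholders standing in for real inventory and credit lookups), might look like this:

```ballerina
import ballerina/io;

function processOrder() {
    fork {
        worker inventoryCheck {
            boolean invOk = true; // placeholder for a real inventory lookup
            invOk -> fork;
        }
        worker creditCheck {
            boolean creditOk = true; // placeholder for a real credit lookup
            creditOk -> fork;
        }
    } join (all) (map results) {
        // the join block runs once all workers have sent their results to the
        // fork; results is keyed by worker name
        io:println(results);
    }
}
```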
Maybe you have a requirement where you do not know the number of workers you will need; that is, you need to spawn new workers dynamically to execute in the background, or you need to make an existing function run in the background. This is where Ballerina's asynchronous function invocation comes into play. For any function or action invocation, you can signal the runtime to run it asynchronously by prefixing the invocation with the "start" keyword. The call then returns immediately, and the workers inside the target function execute in the background. An example of an asynchronous invocation can be seen below in Listing 04.
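Listing 04 is not reproduced here; the following is a minimal sketch of an asynchronous invocation using the start/future/await behavior described in this article (creditScore is a hypothetical function, not a real API):

```ballerina
import ballerina/io;

function creditScore(string customerId) returns int {
    // placeholder for a slow lookup
    return 750;
}

function main(string... args) {
    // run creditScore in the background; start returns a future immediately
    future<int> f = start creditScore("C100");
    // ... do other useful work here while the call runs ...
    int score = await f; // block only now, when the result is actually needed
    io:println(score);
}
```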
Every asynchronous function invocation immediately returns a "future" object. This future can be used to get the result of the target function call via the await instruction, and it also supports operations such as "isDone", "isCancelled", and "cancel" to check the running status of the asynchronous call, or to cancel it if needed.
When you do "await <future_object>" on an asynchronous call, the behavior is the same as making a normal function call and waiting for the result. In the successful scenario, it eventually returns the result value of the target function; if the target function throws an error, the await statement throws the exact same error to the caller. In this manner, asynchronous calls can be adopted in your code easily, without any additional complexities to consider.
Ballerina VM Worker Scheduler
All of the above constructs are possible due to the efficient worker scheduling functionality in the Ballerina Virtual Machine, or the BVM. The BVM follows a fully non-blocking worker scheduling model, where workers are light-weight concurrent execution units that use the underlying operating system threads as efficiently as possible. Any blocking operation, such as network I/O, a worker sleep, or acquiring a lock, puts the currently executing worker into an inactive mode, so it does not block a live thread.
Also, the Ballerina programming model is implemented so that the user does not have to think about these details; they are taken care of automatically by the system. For example, when you do a network-related operation such as an HTTP call, it will by default use non-blocking I/O, releasing the current worker while the call is in flight and waking it up when the response is returned to the caller. The following sample code demonstrates this.
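The sample code is missing from this text; the following hedged sketch shows an HTTP GET in the endpoint style of the Ballerina releases this article appears to target (the URL and path are placeholders, and newer Ballerina versions use a different client syntax):

```ballerina
import ballerina/http;
import ballerina/io;

// an HTTP client endpoint; the URL is a placeholder
endpoint http:Client clientEP {
    url: "https://example.com"
};

function main(string... args) {
    // the "get" action uses non-blocking I/O under the hood; this worker is
    // parked and its backing OS thread released until the response arrives
    var resp = clientEP->get("/data");
    match resp {
        http:Response res => {
            io:println(res.statusCode);
        }
        error e => {
            io:println(e.message);
        }
    }
}
```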
In the above code, when the "get" action is called on the endpoint, the worker makes a non-blocking call, goes into an inactive state, and releases its backing OS thread. This ensures that we never waste OS threads on network operations and the like, making Ballerina programs use the CPU in the most optimal way. This programming model also makes the developer's life much easier: there is no need to write separate callbacks and similar machinery to handle the responses of non-blocking calls. This complexity is hidden from the user, and Ballerina provides a clean programming model instead.
That is it for now. I hope you enjoy programming in Ballerina, and keep exploring the many exciting features it has to offer.