Ballerina: Making Sequence Diagrams Work
Creating functions, services, and more with sequence diagrams
[This blog is a continuation of a series of blogs I’m writing about Ballerina, the sequence diagram based programming language we’re creating. See http://ballerinalang.org/ for more information. My last blog was about how Ballerina was conceived; see https://medium.com/ballerinalang/conceiving-ballerina-2dadf67c0503.]
This blog explains how we made sequence diagrams work as a way to write programs and how we make the text syntax and the graphic syntax consistent and equivalent.
This is a continuation of the blogs about Ballerina concepts. You might want to read the previous one too if you haven’t yet: https://medium.com/ballerinalang/conceiving-ballerina-2dadf67c0503
As a programming model, we needed to have capabilities to represent reusable bits of code (i.e., a function). So what’s a function in sequence diagram parlance? Actually that’s easy: it’s just a particular sequence diagram with as many participants as needed, but with a main guy who gets started by some external trigger and then keeps going until the end.
We then realized that what you need is not one sequence diagram but rather many diagrams! Each function is its own diagram.
Services & resources
Given we were developing a system to respond to network requests, we knew we had to have a way to create “services” of some kind. Again there was a lot of debate on whether we should call that a service (which is perfectly accurate technically and English-wise) or use the more politically correct / market-popular term, an API. Luckily, sanity prevailed and we now have a “service” concept in Ballerina.
We also knew that a service had to allow one to aggregate a collection of capabilities that work together to offer the service. We liked the abstract terminology of REST and so decided that the unit entity would be called a “resource”. Thus, a service is a collection of resources. Ballerina services (and resources) are not at all limited to HTTP, but that’s OK as neither is REST. Roy (Fielding) may have a fit with Ballerina as we don’t (can’t) force you to write properly RESTful stuff with things called “resource”, but that was a conscious compromise.
So what’s a resource in sequence diagram parlance? Resources are basically functions that get invoked over a network, but other than that they have semantics similar to functions. However, we wanted to design a model where, unlike functions, they don’t have a single entry / single exit model of execution. Also, in sequence diagrams, we always draw the “client” too and show how it interacts with the rest of the participants. So in our design, resources are like functions but we also represent the client as a participant. We are not really using that much yet, but I have some ideas on how to use it more to make complex interactions easier to describe. Join the firstname.lastname@example.org if you want to get involved!
Once we have a resource, a service is a group of resources but without any time coordination implied. That is, while we may draw many resources using the same participant lifelines, that does not imply they are coordinated or ordered in any way! We had to do it that way as it’s a common requirement to have shared network endpoints that multiple resources interact with.
What about state and lifecycle?
This brings up the question of the lifecycle of functions, resources and services. In the Ballerina approach, functions are inactive entities that get a thread of execution when someone calls them. They themselves can have local state while executing but, other than for actions with side effects, they have no state shared across invocations.
However, that approach does not work for services: it’s a very common need to have a shared connection pool to remote entities for a given service and to have some state that is persisted throughout the lifetime of the service. So what’s the lifetime of a service? In Ballerina it’s basically “forever”. Services come into existence at some point after the server starts (with no guarantees as to exactly when) and hang around “forever” after that as long as the server is around. Thus, we allow a service to establish connections to network services, share them across all resource invocations, and have variables with state that are available consistently across the resources.
Resources are like functions in that they are inactive and have no internal state, but they can manipulate the permanent state of the service. Given many resources can execute in parallel, shared state is accessed in a synchronized way, transparent to the resource code.
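To make the shared state idea concrete, here is a rough analogy in Python (not Ballerina syntax). The class name `TweetService`, its two “resources”, and the counter are all hypothetical names made up for this sketch. In Ballerina the synchronization is done for you by the runtime; here a helper method holding a lock stands in for that, so the resource bodies never touch the lock themselves.

```python
import threading


class TweetService:
    """Hypothetical service: its state lives as long as the service object."""

    def __init__(self):
        self._lock = threading.Lock()  # guards the service-level state
        self.request_count = 0         # state shared by all resources

    def _count_request(self):
        # Stand-in for the runtime-provided synchronization.
        with self._lock:
            self.request_count += 1

    # Two "resources": they keep no state of their own between calls,
    # but both update the state of the service.
    def post_tweet(self, text):
        self._count_request()
        return f"posted: {text}"

    def get_timeline(self):
        self._count_request()
        return []


def client(service, calls):
    # Simulates one client invoking both resources repeatedly.
    for i in range(calls):
        service.post_tweet(f"tweet {i}")
        service.get_timeline()


service = TweetService()
threads = [threading.Thread(target=client, args=(service, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4 clients * 1000 iterations * 2 resource calls each
print(service.request_count)  # prints 8000
```

Because every update goes through the lock, the final count is the same no matter how the four client threads interleave.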
The really cool thing with sequence diagrams is of course the ability to describe multiple participants and define the behavior of each one. That’s the “worker” concept in Ballerina: basically a thread of execution that you can program. Any function or resource can have any number of workers. They can be programmed the same way the default worker of a function or resource is, by placing blocks of logic on the line representing the worker. Ballerina has also introduced a coordination mechanism that lets workers executing in parallel interact using messages.
Workers do not share any state with each other. Upon invocation the “parent” worker can pass a message to the new worker and that’s all the state the worker can access. In the case of resource workers, they can also access service state of course.
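As a loose illustration of that message-only model, here is a small Python sketch (again an analogy, not Ballerina syntax). The function names `summing_worker` and `default_worker` and the use of queues as message channels are assumptions for this example. The point is that the worker sees only what arrives in its inbox.

```python
import queue
import threading


def summing_worker(inbox, outbox):
    # The worker's only input is the message it receives.
    numbers = inbox.get()
    outbox.put(sum(numbers))


def default_worker(numbers):
    to_worker = queue.Queue()
    from_worker = queue.Queue()

    # daemon=True so a stuck worker cannot keep the program alive.
    worker = threading.Thread(
        target=summing_worker, args=(to_worker, from_worker), daemon=True
    )
    worker.start()

    # Send a copy so the two sides do not share a mutable object.
    to_worker.put(list(numbers))

    # Wait for the reply message from the worker.
    return from_worker.get(timeout=5)


total = default_worker([1, 2, 3, 4])
print(total)  # prints 10
```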
If any workers are still going when the default worker of a resource or function completes, we blast them away. That is primarily to prevent runaway workers.
The fork/join concept is also a cool way to write parallel code in Ballerina. That’s basically a pre-canned pattern of creating workers and waiting for them to complete. See the language spec for more info: https://github.com/ballerinalang/ballerina/tree/master/docs/specification.
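For readers who have not met the pattern before, fork/join looks roughly like this in Python (an analogy only; the Ballerina syntax is in the spec linked above). The two task functions and `plan_trip` are hypothetical and just return strings so the sketch stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_flights(city):
    # Hypothetical task; a real one would call a remote service.
    return f"flights to {city}"


def fetch_hotels(city):
    # Hypothetical task; a real one would call a remote service.
    return f"hotels in {city}"


def plan_trip(city):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fork: start both tasks in parallel.
        flights = pool.submit(fetch_flights, city)
        hotels = pool.submit(fetch_hotels, city)
        # Join: block until both have finished, then combine the results.
        return {"flights": flights.result(), "hotels": hotels.result()}


trip = plan_trip("Colombo")
print(trip)
```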
Representing network endpoints
What about network endpoints that the Ballerina program is interacting with? We wanted to represent them as participants too of course as that’s what we have always done when explaining complex interactions in sequence diagrams.
However, they’re not the same as workers because we don’t program them directly: they are under someone else’s control and are already “running”. Thus, in Ballerina the participant lines for network endpoints are there only to represent the remote system. As such, we can only show interaction points and not actually program their behavior.
Graphically, we distinguish between the workers and network endpoints (which we call “connectors”) by using a different style to draw the participant in the diagram. The current Ballerina Composer simply uses shading but more visual cues will be introduced to make clear that they are passive representations of networked endpoints and not programmable workers.
Showing interactions with network endpoints
We carefully distinguish between an interaction with a connector (which we call an “action”) and a normal function call. This is to avoid one of the fallacies of distributed computing: that you can hide the network. That is, both textually and graphically, network interactions are clearly distinguished from local invocations. The graphic representation draws a line to the connector line from the worker line.
To ensure that a program could be viewed both graphically and textually, we forced a text syntax compromise: action invocations are statements by themselves and cannot be used as part of expressions. That is, if you want to interact with a connector, then you need to write that as a statement of its own and save the result into a variable (if appropriate). You can’t just take an action invocation and put it inside an expression. Thus, we don’t treat action invocations as random stub functions, as is done in most programming languages that don’t understand networked interactions as a first-class concept.
message response = http:ClientConnector.post(tweeterEP, tweetPath, request);
Note that “connector” is a very general concept in Ballerina. We support network endpoint access (e.g. HTTP, HTTP with Basic Auth, HTTP with OAuth, JMS) as well as things that we don’t normally think of as “networked endpoints”, such as databases. We plan to keep extending the concept to any remote service as we go along.
Why not still dataflow for a worker?
While using sequence diagrams to program multi-party interactions, we could have easily continued to use dataflow as the programming model of a single participant. However, we have chosen to use a traditional imperative style of programming with variables, assignment, iteration and all the usual good stuff.
Why? What we’ve seen over the years is that while dataflow diagrams are awesome for simple stuff and in fact are beautiful to view graphically in those cases, they really don’t scale very well for complex things. That is, they fail the test of maintaining linear complexity growth, even if we grant the benefit of a high gradient! We basically had grown tired of the various hacks we had to put in to make dataflow work for scenarios that were not naturally dataflow in nature.
Of course imperative style is no panacea either — c’est la vie.
In a future blog I will explore why we made Ballerina into a full programming language and not a configuration language or a DSL. And various other topics that will help those interested in understanding the why behind Ballerina.
[Next Blog: Thinking about names .. why restrict to English? https://email@example.com/ballerina-thinking-about-names-why-restrict-to-english-c1f9803e827]