Ballerina Shell REPL — Implementation Overview

Sunera Avinash
Ballerina Swan Lake Tech Blog
Mar 1, 2021

To get an overview of what Ballerina Shell is, please refer to my previous article.

Ballerina Shell is a REPL for Ballerina that allows developers to quickly prototype and explore language features. It is an experimental feature coming to Ballerina Swan Lake. This article explores the technical implementation of Ballerina Shell.

Ballerina Shell allows users to explore the Ballerina language

Ballerina Shell processes user inputs sequentially, one at a time. Each snippet goes through a pipeline with two main phases: parsing and invocation. The parsing phase identifies the syntactic structure of the input and categorizes it. The invocation phase executes the identified snippet.

Ballerina Shell: Snippet Pipeline

Now let us go through each phase in detail.

Parsing Phase I — Preprocessor

The preprocessor is the first transformational phase in the Shell. Each user input is sent through the preprocessor, which converts it into a list of strings that the later stages can process individually. For example, if the user input contained two function declarations, the preprocessor would split it into two parts.

For the input below, the preprocessor produces the following list of statements.

Input to preprocessor: 
import ballerina/io; function hi (string name) { io:println("Hi " + name + "!!"); }; hi("Sunera")
Output from preprocessor:
- import ballerina/io;
- function hi (string name) { io:println("Hi " + name + "!!"); };
- hi("Sunera")
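The splitting step can be pictured with a minimal Java sketch. This is not the actual Ballerina Shell preprocessor. The class and method names are hypothetical, and it assumes a much simpler rule than the real one: split at every semicolon that sits outside brackets and string literals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual Ballerina Shell preprocessor.
class PreprocessorSketch {

    // Splits raw input at semicolons that are outside brackets and string literals.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting level of (), [] and {}
        boolean inString = false; // true while inside a "..." literal

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            current.append(c);

            if (inString) {
                if (c == '\\' && i + 1 < input.length()) {
                    current.append(input.charAt(++i)); // copy the escaped character as-is
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }

            if (c == '"') {
                inString = true;
            } else if (c == '(' || c == '[' || c == '{') {
                depth++;
            } else if (c == ')' || c == ']' || c == '}') {
                depth = Math.max(0, depth - 1);
            } else if (c == ';' && depth == 0) {
                flush(parts, current); // a top-level semicolon closes the current part
            }
        }
        flush(parts, current); // trailing input without a terminating semicolon
        return parts;
    }

    // Adds the trimmed buffer to the result (if not blank) and clears the buffer.
    private static void flush(List<String> parts, StringBuilder current) {
        String text = current.toString().trim();
        if (!text.isEmpty()) {
            parts.add(text);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        String input = "import ballerina/io; function hi (string name) "
                + "{ io:println(\"Hi \" + name + \"!!\"); }; hi(\"Sunera\")";
        split(input).forEach(part -> System.out.println("- " + part));
    }
}
```

With the input from the example, this sketch yields the same three parts as listed above. It would not separate two declarations that are written without a semicolon between them, which is one reason a real preprocessor needs more than this rule.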

Parsing Phase II — Tree Parser

The second stage, the Tree Parser, builds the syntax tree of each statement produced by the preprocessor.

The current strategy is to try parsing the input as a declaration, as a statement, and as an expression in turn, and to take the first syntax tree that contains no syntax errors.

After this phase, the statements below are translated into syntax trees with the following root nodes.

Input to tree parser: 
- import ballerina/io;
- function hi (string name) { io:println("Hi " + name + "!!"); };
- hi("Sunera")
Output from tree parser (root node of the syntax tree):
- ImportDeclarationNode
- FunctionDefinitionNode
- FunctionCallExpressionNode
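The "try each kind and keep the first clean result" strategy can be sketched in Java as follows. The three recognizers here are toy stand-ins based on simple string checks, not the Ballerina parser, and `StatementNode` is a placeholder name made up for this illustration. Only the control flow is the point.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of the trial-and-error strategy; the recognizers are toys.
class TreeParserSketch {

    // Each attempt returns a root node name, or empty if the input "has syntax errors".
    static Optional<String> asDeclaration(String src) {
        if (src.startsWith("import ")) {
            return Optional.of("ImportDeclarationNode");
        }
        if (src.startsWith("function ")) {
            return Optional.of("FunctionDefinitionNode");
        }
        return Optional.empty();
    }

    static Optional<String> asStatement(String src) {
        // Placeholder rule: anything ending in a semicolon counts as a statement.
        return src.endsWith(";") ? Optional.of("StatementNode") : Optional.empty();
    }

    static Optional<String> asExpression(String src) {
        // Placeholder rule: only recognizes calls of the form name(...).
        return src.matches("\\w+\\(.*\\)")
                ? Optional.of("FunctionCallExpressionNode")
                : Optional.empty();
    }

    // Attempts are tried in this order: declaration, statement, expression.
    static final List<Function<String, Optional<String>>> ATTEMPTS = List.of(
            TreeParserSketch::asDeclaration,
            TreeParserSketch::asStatement,
            TreeParserSketch::asExpression);

    // Returns the root node of the first attempt that succeeds.
    static String rootNode(String src) {
        String trimmed = src.trim();
        for (Function<String, Optional<String>> attempt : ATTEMPTS) {
            Optional<String> result = attempt.apply(trimmed);
            if (result.isPresent()) {
                return result.get();
            }
        }
        throw new IllegalArgumentException("Could not parse: " + src);
    }

    public static void main(String[] args) {
        System.out.println(rootNode("import ballerina/io;"));
        System.out.println(rootNode("hi(\"Sunera\")"));
    }
}
```

Because every input may be parsed up to three times before a match is found, the cost of this approach grows with the number of kinds that are tried.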

Parsing Phase III — Snippet Factory

The third and final stage of parsing in Ballerina Shell is the Snippet Factory. It maps each syntax tree node to a specific category and creates a snippet from it. How a snippet is processed depends on its type. In Ballerina Shell, there are five main snippet types, shown in the following table.

Snippet types in Ballerina Shell

As explained, the user input gets converted from a string to a snippet that is well categorized. The following diagram depicts the parser stages mentioned above for further clarification.

Parsing Stages

Invocation Phase

Now we have a collection of snippets, each identified, parsed, and ready to execute. Execution takes place in the Invoker. The invocation strategy differs slightly depending on the snippet type, but it generally follows the same approach.

  1. Wrap the user input with a generated wrapper to make it an independently executable file. For example, we can wrap statement snippets with the main function to make them executable.
  2. Compile the file to make sure there are no errors and to create an executable.
  3. Execute the file if the snippet can produce an output. If not, there is nothing to execute (e.g., Import Declaration Snippets and Module Member Declaration Snippets).
  4. Report the output to the user. If there were errors, report them as well.
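Step 1 above can be illustrated with a small Java sketch that builds the source text of a wrapper file around a statement snippet. The class name, method name, and wrapper layout are assumptions made for this illustration. The wrapper that Ballerina Shell actually generates contains more than this, and compiling and running the file (steps 2 to 4) is left out.

```java
import java.util.List;

// Hypothetical sketch of wrapper generation; the real generated wrapper is more involved.
class WrapperSketch {

    // Builds a standalone Ballerina source file: imports and earlier declarations
    // first, then the statement snippet wrapped in a main function.
    static String wrapStatement(List<String> imports, List<String> declarations, String statement) {
        StringBuilder source = new StringBuilder();
        for (String importLine : imports) {
            source.append(importLine).append('\n');
        }
        for (String declaration : declarations) {
            source.append(declaration).append('\n');
        }
        source.append("public function main() {\n");
        source.append("    ").append(statement).append('\n');
        source.append("}\n");
        return source.toString();
    }

    public static void main(String[] args) {
        String source = wrapStatement(
                List.of("import ballerina/io;"),
                List.of("function hi (string name) { io:println(\"Hi \" + name + \"!!\"); }"),
                "hi(\"Sunera\");");
        System.out.print(source);
    }
}
```

The imports and declarations collected from earlier snippets are repeated in every generated file, which is what makes each file executable on its own.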

The steps above might raise a question. REPLs differ from running code segments in standalone files because a REPL maintains a single state. It keeps the same global state throughout a session, regardless of how many compilation or runtime errors it encounters. But if Ballerina Shell creates an individual file for each execution, how does it manage the state? Or rather, how does it make sure that the same global state is available in every generated file?

Before that, we have to answer “what makes up the REPL state?”. In my opinion, the REPL state can be divided into two parts: declarations and global variable values. Consider module-level declarations (functions, classes, type definitions, and so on). Including them in all subsequent wrappers is enough; we do not have to update them because they do not change. Global variables are different: their values can change with each snippet, since the user can easily write statements that mutate them. So restoring the state is not as trivial as just including the definitions in the following wrappers. We have to restore the variable values as well.

Above I have given a clue as to how Ballerina Shell loads and saves variables. The variable value gets loaded into the program via the LOAD_VARIABLE function and then saved back to the same place via SAVE_VARIABLE. But what should LOAD_VARIABLE and SAVE_VARIABLE do? They should load and save variables using a memory space shared by all of these programs. In the Ballerina REPL, that shared memory space is a simple static Java class in the host application (the REPL) that acts as a map from variable names to their values.

Above, you can see a sequence diagram explaining the same process. This is a very generic approach, but to make it viable, you need a way to share some memory between snippets. Ballerina Shell has a static Java class and uses the Java interoperability of Ballerina to store the values of the variables defined in the REPL and to load them back in subsequent iterations.
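The idea of a static class acting as shared memory can be sketched in plain Java. This is an illustration only. The class and method names are hypothetical, and the two `...Snippet` methods stand in for two separately generated programs, which in Ballerina Shell would be Ballerina code reaching the class through Java interoperability.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the shared memory; not the actual Ballerina Shell class.
class ReplMemorySketch {

    // Lives in the host application, so it outlives every generated program.
    private static final Map<String, Object> VALUES = new HashMap<>();

    // Counterpart of SAVE_VARIABLE: store the current value under the variable name.
    static void saveVariable(String name, Object value) {
        VALUES.put(name, value);
    }

    // Counterpart of LOAD_VARIABLE: fetch the last saved value (null if never saved).
    static Object loadVariable(String name) {
        return VALUES.get(name);
    }

    // Stand-in for the program generated for a first snippet such as: int x = 10;
    static void firstSnippet() {
        long x = 10;
        saveVariable("x", x); // persist the value before the program ends
    }

    // Stand-in for the program generated for a later snippet such as: x = x + 5;
    static void secondSnippet() {
        long x = (Long) loadVariable("x"); // restore the value left by the earlier snippet
        x = x + 5;
        saveVariable("x", x);              // write the mutated value back
    }

    public static void main(String[] args) {
        firstSnippet();
        secondSnippet();
        System.out.println(loadVariable("x"));
    }
}
```

Each generated program starts by loading the variables it needs and ends by saving them, so the map always holds the state that the next snippet should resume from.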

Following is another diagram showcasing the execution of snippets and the state of the memory class.

How Ballerina Shell ‘resumes’ execution

Of course, to ensure that the generated wrapper works for all possible statements, expressions, and declarations, much more processing has to happen under the hood. But this is the basic idea behind Ballerina Shell.

Restrictions and Room for Improvement

This is an experimental tool, and your feedback is essential to making it better. If you find an issue, please report it at https://github.com/ballerina-platform/ballerina-lang/issues. The current implementation has the following known limitations.

  1. The current parsing methodology is not ideal. The current Ballerina Shell implementation follows a trial-and-error approach, and because the same input may be parsed several times, this is costly in terms of performance.
  2. Because of the way the snippet wrapper is generated, supporting some situations, such as closures, is not trivial. In fact, with this approach used directly, closures will not work as intended because their changes will not get saved to the shared memory. One way to solve this problem is to process the user code to remove all references to global variables and replace them with LOAD_VARIABLE or SAVE_VARIABLE calls directly.

Summary

This post covered the implementation of Ballerina Shell, a REPL for Ballerina. I hope it gave you an overview of how it works. This is an experimental tool, still in early development, so there may be drastic changes in the coming months. More features are coming soon, and your feedback is essential to making this a better tool. If you find an issue, please do not hesitate to report it.
