Looking at code through the prism of JetBrains MPS

Mikhail Barash
12 min read · Aug 6, 2018

I am starting a series of posts explaining JetBrains MPS, a powerful tool to define and implement domain-specific languages and IDEs for them. I am trying to find a way to explain MPS to those who do not have any idea about abstract syntax trees, metaprogramming, model-to-model transformations, and other “scary stuff”.

I will start “from the middle”: not with what a domain-specific language is, not even with what parsing is. Instead, I will focus on several small pieces of code. This first post is about how to look at them through the eyes of a language developer or, as I like to put it metaphorically, through the prism of MPS.

Disclaimer

This is by no means a tutorial on JetBrains MPS. Instead, this is an explanation of some of the ideas that are behind it. Many points in this post have been (over)simplified for greater clarity. I refer you to a paper “Language Oriented Programming: The Next Programming Paradigm” by Sergey Dmitriev and a post “A Language Workbench in Action — MPS” by Martin Fowler for more details.

The prism of aspects.

What is JetBrains MPS?

This is a language workbench, which, according to Wikipedia, is

a software development tool designed to define, reuse and compose domain-specific languages together with their integrated development environment

Let’s have a look at Scratch, a visual programming language, where the code is not represented as text, but rather as graphical “blocks” with placeholders.

Example of code in Scratch.

Scratch has blocks for representing conditional statements, loops, events (along with many more other kinds of blocks), and is essentially a full-blown programming language.

Fragments of the palette in Scratch. Available blocks are categorized in several groups according to their “semantics”.

Now, what if we want to create our “own Scratch”, with custom kinds of blocks? For example, we might want to define a block for a SELECT FROM statement as in SQL, and so on. This is what Blockly allows us to do.

In Blockly, one can define custom blocks that can be used later to build programs. Image credit: https://developers.google.com/blockly/images/input-types.png

Here is the point:

  • Scratch is a visual programming language with an IDE
  • Blockly is also a visual language with an IDE, but it is also a meta-tool: it allows us to define visual languages and IDEs for them

Now we can think of JetBrains MPS as a very powerful Blockly :-)

Similarly to Scratch and Blockly, code in MPS is not text, although it may look and feel like text. More on that below.

Remark

If you wonder why one should even bother with non-textual representation of code, the short answer is: to be able to combine languages with “conflicting” syntaxes. Imagine two semantically different blocks in Blockly that happen to look exactly the same. It surely would not be a problem for Blockly to understand what is what — after all, these two blocks have two different spots on the palette and the IDE knows which one the user selected when “drawing” the code. The same principle applies in MPS.

Let’s forget for a while about fancy visual stuff, and focus on good old Java.

Structure & Editor

Here is a simple if statement in Java:

if (x > y * 2 + 100 && ! myList.contains(200) || this.isVisible()) {
    myList.add(x);
    myList.add(y);
}
else {
    System.out.println("Condition not met!");
}

When you see it, what do you think about? I bet you try to get what it does, that is, you focus on the semantics. Indeed, if checks whether x is greater than something and that myList doesn’t contain something or this is visible. Doesn’t make much sense, I know.

But what if we look at this snippet in a bit more abstract way? First of all, this is an if statement.

And what does an if statement “contain”?

It contains

  • a condition:
x > y * 2 + 100 && ! myList.contains(200) || this.isVisible()
  • statements that are executed if the condition is met:
myList.add(x);
myList.add(y);
  • statements that are executed if the condition is not met:
System.out.println("Condition not met!");

Let’s visualize this.

Statements that are executed if the condition is met are denoted with true_block, and statements that are executed if the condition is not met are denoted with false_block.

What could be, so to say, the “class” of each of these elements? Well, condition is an expression, true_block will contain statements, and so will false_block.

Every if statement always has exactly one condition, and may or may not have statements in either true_block or false_block.

Voilà! Now we’ve got the structure of an if statement. This is also called abstract syntax.

A thing like “if statement” is called a concept. Examples of other concepts are “while statement”, “variable declaration”, “expression”, and so on. It is, if you wish, a kind/sort/type of a block in Blockly.
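
The structure described above can be sketched as a plain data type. This is only an illustration in Java, not the notation MPS itself uses; all class names (IfConceptSketch, RawExpression, RawStatement) are made up for this sketch, and expressions and statements are kept as opaque text to keep it small.

```java
import java.util.List;

// Hypothetical names: plain Java, not the MPS structure language.
final class IfConceptSketch {
    /** Marker for anything that can appear as a condition. */
    interface Expression {}

    /** Marker for anything that can appear inside a block. */
    interface Statement {}

    /** An expression kept as opaque text. */
    record RawExpression(String text) implements Expression {}

    /** A statement kept as opaque text. */
    record RawStatement(String text) implements Statement {}

    /** The concept "if statement": exactly one condition, two possibly empty blocks. */
    record IfStatement(Expression condition,
                       List<Statement> trueBlock,
                       List<Statement> falseBlock) implements Statement {
        IfStatement {
            if (condition == null) {
                throw new IllegalArgumentException("an if statement needs exactly one condition");
            }
            trueBlock = List.copyOf(trueBlock);   // may be empty
            falseBlock = List.copyOf(falseBlock); // may be empty
        }
    }

    /** Builds the structure of the Java example from the text. */
    static IfStatement example() {
        return new IfStatement(
                new RawExpression("x > y * 2 + 100 && ! myList.contains(200) || this.isVisible()"),
                List.of(new RawStatement("myList.add(x);"), new RawStatement("myList.add(y);")),
                List.of(new RawStatement("System.out.println(\"Condition not met!\");")));
    }
}
```

Note that nothing here says anything about braces or keywords: the type only records what an if statement contains.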

After we have fixed the structure of our concept “if statement”, we can start thinking about what it may look like in code. We can freely choose this representation (called concrete syntax), be it Java-like syntax with curly braces, or Pascal-like with begin and end. It actually doesn’t matter, because in MPS it’s just pretty printing of the structure.

Still, let’s go for Java syntax.

Let’s imagine for a moment that we have a table, and each “word” in this Java representation of if is located in its own cell.

We can now distinguish between four kinds of cells:

  • cells that contain keywords (if, else)
  • cells that contain special symbols: (, ), {, } and whitespace
  • cells that represent indentations
  • cells that contain elements from the structure of the concept

Cells with keywords and special symbols have white background, cells with indentations are grayed, and cells with elements from the structure are brown.

Cells with keywords and special symbols are “frozen”: it should not be possible to change them. Indeed, the keyword if should always be spelled like if in Java; the condition should be surrounded by parentheses, and so on.

The only things that a programmer should be able to modify are the three placeholders for condition, true_block, and false_block.

If we stick to these rules, the code will always be syntactically correct. This is very similar to Scratch.

It’s impossible to misspell “if”, or skip “then” in Scratch.

That is essentially how projectional editing in MPS works. Code is not represented as text, but it is rather a semi-graphical pretty printing of a structure. This pretty printing is editable though. You might think of it as “less visual” Scratch.
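
To make the idea of frozen and editable cells concrete, here is a minimal sketch in Java. It is not how MPS implements its editor; the names Cell, ifHeader, edit and render are assumptions made for this illustration, and only the first line of the if statement is modeled.

```java
import java.util.List;

// Hypothetical names: an illustration of "cells", not the MPS editor API.
final class CellSketch {
    /** One cell of the projection; only placeholder cells are editable. */
    record Cell(String text, boolean editable) {
        static Cell frozen(String text) { return new Cell(text, false); }
        static Cell placeholder(String text) { return new Cell(text, true); }
    }

    /** Java-like row of cells for the first line of an if statement. */
    static List<Cell> ifHeader(String condition) {
        return List.of(
                Cell.frozen("if"), Cell.frozen(" "), Cell.frozen("("),
                Cell.placeholder(condition),
                Cell.frozen(")"), Cell.frozen(" "), Cell.frozen("{"));
    }

    /** Replaces the text of an editable cell; frozen cells reject the edit. */
    static Cell edit(Cell cell, String newText) {
        if (!cell.editable()) {
            throw new UnsupportedOperationException("frozen cell: " + cell.text());
        }
        return new Cell(newText, true);
    }

    /** Pretty-prints a row by concatenating the text of its cells. */
    static String render(List<Cell> row) {
        StringBuilder out = new StringBuilder();
        for (Cell cell : row) {
            out.append(cell.text());
        }
        return out.toString();
    }
}
```

Because the keyword and the parentheses live in frozen cells, the rendered line can never be syntactically broken by an edit.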

Multiple projections

What if someone prefers Visual Basic-like syntax for the if statement? No problem! In MPS, we can define multiple syntaxes (called projections) for a single concept.

Moreover, we can switch between different projections when we edit the code.
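
As a rough sketch of what “several projections for one concept” means, the Java code below prints the same node in two ways. The names IfNode, javaLike and basicLike are hypothetical, statements are stored as plain strings, and the real MPS editor is of course editable rather than a one-way printer.

```java
import java.util.List;

// Hypothetical names: two printers over one structure, not the MPS editor language.
final class ProjectionSketch {
    /** The structure: statements are kept as plain strings without terminators. */
    record IfNode(String condition, List<String> trueBlock, List<String> falseBlock) {}

    /** Java-like projection with parentheses and curly braces. */
    static String javaLike(IfNode node) {
        StringBuilder out = new StringBuilder("if (" + node.condition() + ") {\n");
        for (String s : node.trueBlock()) {
            out.append("    ").append(s).append(";\n");
        }
        out.append("}");
        if (!node.falseBlock().isEmpty()) {
            out.append(" else {\n");
            for (String s : node.falseBlock()) {
                out.append("    ").append(s).append(";\n");
            }
            out.append("}");
        }
        return out.toString();
    }

    /** Visual Basic-like projection of the very same node. */
    static String basicLike(IfNode node) {
        StringBuilder out = new StringBuilder("If " + node.condition() + " Then\n");
        for (String s : node.trueBlock()) {
            out.append("    ").append(s).append("\n");
        }
        if (!node.falseBlock().isEmpty()) {
            out.append("Else\n");
            for (String s : node.falseBlock()) {
                out.append("    ").append(s).append("\n");
            }
        }
        return out.append("End If").toString();
    }
}
```

Switching the projection changes only which printer is used; the IfNode itself stays untouched.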

Editing the structure

All the keywords, parentheses, curly braces and indentations do not matter when a programmer edits code. This is very similar to Scratch. What matters are the elements that we described in the structure of the concept: condition, true_block and false_block.

When a programmer writes/modifies an if statement, they change the structure of the corresponding concept. This creates “feedback” from the editor to the structure. Keywords, parentheses and indentations cannot be changed — it wouldn’t make much sense anyway if they could.

So far, we have been looking at the same concept “if statement” from two “angles”: one is its structure and the other is its representation. These are aspects of a language, and in MPS these two particular aspects are called (no surprise) Structure and Editor.

In the remainder of this post I will talk about more aspects.

Intentions Aspect

We start with the editing experience: it’s crucial for a user of a language, especially when the language is not textual.

The Intentions aspect allows us to define quick fixes, or intentions, that are shown in the IDE.

Example of intentions in IDEA.

For our concept “if statement”, we could define intentions for:

  • negating the condition (for example, from x==3 to x!=3, or from x>10 to !(x>10), and so on)
  • removing the else clause with false_block (this will also remove the braces)
  • exchanging true_block and false_block
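
Each of these intentions is just a small transformation of the structure. The following Java sketch shows the three of them; it is an illustration under simplifying assumptions (conditions are either a comparison or a negation, blocks are lists of strings), and names such as IntentionSketch, negate, removeElse and swapBlocks are invented here rather than taken from MPS.

```java
import java.util.List;

// Hypothetical names: a plain-Java sketch, not the MPS intentions language.
final class IntentionSketch {
    interface Expr { String show(); }

    record Comparison(String left, String op, String right) implements Expr {
        public String show() { return left + op + right; }
    }

    record Not(Expr inner) implements Expr {
        public String show() { return "!(" + inner.show() + ")"; }
    }

    record IfNode(Expr condition, List<String> trueBlock, List<String> falseBlock) {}

    /** Negates an expression: == and != are flipped, anything else is wrapped in !(...). */
    static Expr negate(Expr e) {
        if (e instanceof Comparison c) {
            if (c.op().equals("==")) return new Comparison(c.left(), "!=", c.right());
            if (c.op().equals("!=")) return new Comparison(c.left(), "==", c.right());
        }
        if (e instanceof Not n) return n.inner(); // !(!(a)) becomes a
        return new Not(e);
    }

    /** Intention: negate the condition of an if statement. */
    static IfNode negateCondition(IfNode node) {
        return new IfNode(negate(node.condition()), node.trueBlock(), node.falseBlock());
    }

    /** Intention: drop the else clause by emptying false_block. */
    static IfNode removeElse(IfNode node) {
        return new IfNode(node.condition(), node.trueBlock(), List.of());
    }

    /** Intention: exchange true_block and false_block. */
    static IfNode swapBlocks(IfNode node) {
        return new IfNode(node.condition(), node.falseBlock(), node.trueBlock());
    }
}
```

None of the intentions touches keywords or braces: the projection is simply redrawn from the new structure.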

Actions Aspect

Let’s have a look at Scratch again. Suppose, when “drawing” the code, we chose to have an if statement without the else clause. How do we add an else clause later?

In JetBrains MPS, this is one thing that is handled via the Actions aspect of a language. It allows us to specify which editing actions change the structure of a concept and, consequently, its projection.

For example, when we want to add an else clause to our if statement, we just type the keyword else after the closing curly brace of true_block. Remember: code in MPS is not text, it’s a semi-graphical projection with a text-like editing experience. That is why “just typing else” would not work by itself: what happens when it is typed has to be defined as an action.

Typing keyword “else” after “}” will modify the structure of the concept and add a false_block with corresponding syntax.

This could correspond to something like this in Scratch: imagine the keyword else being available on a palette; you move it to the bottom of an existing if-then block and the block gets transformed into an if-then-else block. (Again, you cannot “just type else” in this situation in Scratch, and the same applies to MPS!)
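
A possible way to picture such an action is an event handler that reacts to what was typed and where. The Java sketch below is only an analogy: the Position enum, the onType method and the use of Optional to distinguish “no else clause” from “empty else clause” are all assumptions of this illustration, not MPS concepts.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical names: an analogy for an editing action, not the MPS Actions API.
final class ActionSketch {
    /** Where the cursor is when the user types something. */
    enum Position { AFTER_TRUE_BLOCK_BRACE, INSIDE_CONDITION }

    /** falseBlock is empty (Optional.empty) while the statement has no else clause. */
    record IfNode(String condition, List<String> trueBlock, Optional<List<String>> falseBlock) {}

    /** Typing "else" right after the closing brace adds an empty false_block. */
    static IfNode onType(IfNode node, Position position, String typed) {
        boolean addsElse = position == Position.AFTER_TRUE_BLOCK_BRACE
                && typed.equals("else")
                && node.falseBlock().isEmpty();
        if (!addsElse) {
            return node; // any other input leaves the structure unchanged
        }
        return new IfNode(node.condition(), node.trueBlock(), Optional.of(List.of()));
    }
}
```

The user sees the keyword appear as if it had been typed, but what actually happened is a change of structure followed by a redraw.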

Behavior

Let’s repeat the exercise with thinking about structure of a concept. This time, our concept will be “variable declaration”. For the sake of simplicity, we only consider integer variable declarations.

So, a variable declaration has:

  • name of the declared variable
  • possibly its initial value (which can also be an arithmetic expression, of course)

One small detail here: element name is of primitive (built-in) type, whereas init_value is an instance of concept expression that should be defined by us. Let’s keep “primitive” elements to the left, and “our” elements to the right.

In fact, name is a property of a concept, and init_value is a child of a concept. Here is the difference between them: in a sense, initializing expression is “embedded” into the declaration of a variable, it’s a different concept (that is, concept “expression”) that we “invoke” from the definition of concept “variable declaration”. On the other hand, name of the variable is something that the variable “possesses”. That’s why children and properties, respectively.

For children, we can specify their cardinality. In this case, init_value can either be present or absent, so its cardinality is 0..1. Properties are always present, so it makes no sense to specify cardinality for them. We will not go into more details here, though.

Let’s now define the syntax of an integer variable declaration. We deviate a bit from the conventional syntax and go with this:
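
In Java terms, the difference between a property and a child with cardinality 0..1 could be sketched as follows. This is an analogy with hypothetical names (VariableConceptSketch, Literal), not the way concepts are declared in MPS.

```java
import java.util.Optional;

// Hypothetical names: a plain-Java analogy for a concept with a property and a child.
final class VariableConceptSketch {
    /** The concept "expression", defined by us. */
    interface Expression { String text(); }

    /** A trivial expression kept as text. */
    record Literal(String text) implements Expression {}

    /**
     * The concept "variable declaration".
     * name is a property of a built-in type and is always present;
     * initValue is a child with cardinality 0..1, modeled as Optional.
     */
    record VariableDeclaration(String name, Optional<Expression> initValue) {
        VariableDeclaration {
            if (name == null) {
                throw new IllegalArgumentException("property 'name' is always present");
            }
            if (initValue == null) {
                initValue = Optional.empty();
            }
        }
    }
}
```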

number x equals 10 * 20

Here is what the projection with cells (as explained above) will look like.

When a programmer types number in the code, a fragment with two placeholders is added.

number _____ equals ______

Again, this is similar to Scratch, with the only difference that in Scratch you would “draw” a variable declaration, but in MPS you still type it or, more precisely, its leading keyword (in our case, it’s number; this leading keyword is called the alias of a concept in MPS).

The Behavior aspect allows you to “customize” the structure of a concept. For example, we might want the name placeholder to be pre-filled with abc whenever a variable declaration is added to the code.

Now, when a programmer types number, they will get

number abc equals ______

with only one placeholder (for the initial value) left to be filled in.
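
A close everyday analogy for this kind of customization is a constructor that sets default values. The Java sketch below assumes hypothetical names (BehaviorSketch, render, PLACEHOLDER) and stores the initial value as a string; it only mimics the effect described above.

```java
// Hypothetical names: a constructor analogy for pre-filling a property.
final class BehaviorSketch {
    static final String PLACEHOLDER = "______";

    static final class VariableDeclaration {
        String name;      // property
        String initValue; // child; null while its placeholder is still unfilled

        /** Runs whenever a new declaration is created and pre-fills the name. */
        VariableDeclaration() {
            this.name = "abc";
        }

        /** Prints the declaration in the "number ... equals ..." syntax. */
        String render() {
            return "number " + name + " equals " + (initValue == null ? PLACEHOLDER : initValue);
        }
    }
}
```

Creating a fresh declaration and rendering it gives the line with a single remaining placeholder; setting initValue afterwards fills it in.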

Now we are smoothly approaching what I called “scary stuff” at the beginning of this post.

TextGen

So, suppose we have defined all the concepts that we wanted to have in our language. Now we want to transform

number carPower equals 10 * 20

written in our language into, say, Java:

private int carPower = 10 * 20;

There are two ways to make this happen in MPS. The first one is to generate a string that will contain the desired output.

This is essentially string interpolation (or a template expression, if you wish): we have some “fixed” parts (for example, private int) and varying parts (the value of the property name and the value of init_value).

Trick about init_value

If the syntax of expressions in our language differs from Java, for example, if instead of 30+20-50 we could write 30 plus 20 minus 50 in our language, then we would also need to generate the code for (that is, “to translate” or “to compile”) init_value. We are not going into details on that in this post.

Anyway, now we have generated the desired string

private int carPower = 10 * 20;

Is this Java code? The answer is both “Yes” and “No”.

Yes, but no.

That string looks like Java code, but nothing could have prevented us from generating gibberish like this:

private variable of type integer carPower := 10 * 20 ;;;

In other words, when we generate code, we do not check whether what we generate makes sense or not: at the end of the day, it’s just a string.

Fortunately, MPS has a way to overcome this issue.
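
The string-based approach, and its weakness, can be sketched in a few lines of Java. The names TextGenSketch, toJava and toGibberish are hypothetical, and the initial value is assumed to be already valid Java text.

```java
// Hypothetical names: string-based generation sketched in plain Java.
final class TextGenSketch {
    /** The source node: a property and an initial value kept as text. */
    record VariableDeclaration(String name, String initValue) {}

    /** Fixed parts ("private int", "=", ";") interleaved with the varying parts. */
    static String toJava(VariableDeclaration d) {
        return "private int " + d.name() + " = " + d.initValue() + ";";
    }

    /** Compiles just as well, because the output is only a string. */
    static String toGibberish(VariableDeclaration d) {
        return "private variable of type integer " + d.name() + " := " + d.initValue() + " ;;;";
    }
}
```

Both methods have the same type and both compile; nothing in the generator itself can tell which of the two outputs is valid Java.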

Generator

In the second approach to code generation, we specify a correct example of the desired code.

Yes, we do write x and 0 in the generated code, as if they were hard-coded, and we completely ignore name and init_value at this point. This is now a piece of Java code (not just text that looks like code!), and MPS can ensure it is syntactically valid. We couldn’t have skipped or misspelled anything: MPS would have complained and not let us proceed.

Now we can “bind” (or “link”) the value of name to x and the value of init_value to 0.

Again, the trick about init_value applies here (see above).

We associate name with x and init_value with 0. This would look different in MPS, but the idea is still the same.
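
One way to imitate this in plain Java is to generate a small model of the target code instead of a string. In the sketch below, JavaField is a made-up, heavily reduced model of a Java field declaration, TEMPLATE plays the role of the example private int x = 0;, and generate performs the two bindings. The initializer is still kept as text for brevity, so this is only a partial analogy of what MPS does.

```java
import java.util.Locale;

// Hypothetical names: a model-to-model analogy, not the MPS generator language.
final class GeneratorSketch {
    /** Source node in our language. */
    record VariableDeclaration(String name, String initValue) {}

    enum Visibility { PRIVATE, PUBLIC }

    enum JavaType { INT }

    /** Target node: a tiny model of a Java field declaration. */
    record JavaField(Visibility visibility, JavaType type, String name, String initializer) {
        /** Printing is a separate, final step performed on the target model. */
        String print() {
            return visibility.name().toLowerCase(Locale.ROOT) + " "
                    + type.name().toLowerCase(Locale.ROOT) + " "
                    + name + " = " + initializer + ";";
        }
    }

    /** The template: a well-formed example, "private int x = 0;". */
    static final JavaField TEMPLATE = new JavaField(Visibility.PRIVATE, JavaType.INT, "x", "0");

    /** Binds name to the template's x and init_value to the template's 0. */
    static JavaField generate(VariableDeclaration source) {
        return new JavaField(TEMPLATE.visibility(), TEMPLATE.type(),
                source.name(), source.initValue());
    }
}
```

With this shape, a misspelled keyword is impossible by construction: the visibility and the type can only be values that the target model allows.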

What we have just seen is an example of a model-to-model transformation. We are not going into more details at this point.

Types & scopes

And let’s again repeat the (already familiar) exercise: think about the structure of concept “assignment statement”.

It contains:

  • left part (that is, the variable being assigned to)
  • right part (the value that is being assigned)

A couple of different syntaxes for an assignment statement.

Left part is a variable, and right part is an expression.

The TypeSystem aspect allows us to specify that the types of left and right should be the same.

This is a simplified example. In fact, the type of the right-hand side should be a subtype of the type of the left-hand side. Moreover, the TypeSystem aspect is capable of much more than just checking types.
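
The subtype rule can be illustrated with a very small check in Java. The type names NUMBER, INTEGER and TEXT and their hierarchy are invented for this sketch (the language in this post only has integers), and the check is far simpler than what a real type system does.

```java
// Hypothetical names and type hierarchy: a minimal subtype check for assignments.
final class TypeCheckSketch {
    enum Type {
        NUMBER(null),
        INTEGER(NUMBER), // INTEGER is a subtype of NUMBER
        TEXT(null);

        private final Type parent;

        Type(Type parent) {
            this.parent = parent;
        }

        /** True if this type equals other or has it among its ancestors. */
        boolean isSubtypeOf(Type other) {
            for (Type t = this; t != null; t = t.parent) {
                if (t == other) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Only the types of the two sides matter for this check. */
    record Assignment(Type leftType, Type rightType) {}

    /** The rule: the right-hand side type must be a subtype of the left-hand side type. */
    static boolean isWellTyped(Assignment a) {
        return a.rightType().isSubtypeOf(a.leftType());
    }
}
```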

Finally, there is the Constraints aspect. This aspect enables even more fine-tuning of the structure of a concept. For example, it is possible to specify scopes of visibility: in our example, we might require that the variable on the left-hand side of the assignment statement is either a global variable or a local variable (but not a field of a class, for example, assuming we had classes in our language).

When a programmer types an assignment statement, autocompletion for the left-hand side will only suggest local and global variables.
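
The effect on autocompletion can be pictured as a filter over the visible declarations. In this Java sketch the Kind enum, the Variable record and the completions method are hypothetical, and prefix matching is an extra assumption about how the completion list is narrowed while typing.

```java
import java.util.List;

// Hypothetical names: a scope rule expressed as a filter over candidates.
final class ScopeSketch {
    enum Kind { LOCAL, GLOBAL, FIELD }

    record Variable(String name, Kind kind) {}

    /** Suggests only local and global variables whose names start with the typed prefix. */
    static List<String> completions(List<Variable> visible, String prefix) {
        return visible.stream()
                .filter(v -> v.kind() != Kind.FIELD)       // the scope constraint
                .filter(v -> v.name().startsWith(prefix))  // what the user typed so far
                .map(Variable::name)
                .toList();
    }
}
```

A field named color would never be offered on the left-hand side, even though it starts with the same letters as count or cost.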

Other aspects

In this post we do not talk about several other aspects that MPS allows us to specify for a language, for example, the DataFlow aspect. Mind you, it is even possible to define custom language aspects in MPS!

Wrapping up

Here is a diagram with the aspects covered in this post.

There are three main “pillars”:

  • Structure (which is at the center of everything for the language developer), supplemented by Behavior, Constraints and TypeSystem
  • Editor (which is at the center of everything for the user of a language), connected to editing Actions and user-friendly Intentions
  • code generation aspects: TextGen and Generator, two very different aspects that share the same goal, are the center of everything “for the computer”

Acknowledgements

I am grateful to Mikko Pitkänen (University of Turku, Finland) for discussions on aspects of languages in JetBrains MPS.
