Adding Custom Syntax to Babel

Jacob Parker
Oct 26, 2016

Note: entirely by coincidence, a few hours after this was posted, an unrelated patch was pushed to Babel that stops the method described here from working.

Before starting, it’s worth noting that this is not officially supported by Babel yet. There is no documentation for what we’re about to do, and the implementation details can change.

With that in mind, we’ll be extending the parser in Babel to allow Swift-style paren-free if, for, while, and do statements.

When choosing the syntax for blocks, you need a way to tell the condition apart from the body. JS does this by requiring parentheses around the condition; Swift requires braces around the body; Python uses a colon; and some other languages have a then keyword.

We can’t allow skipping both parentheses and braces, or statements like if a [1].forEach(…) become ambiguous: are we looping over the property keyed by 1 on a, or are we testing a and then looping over an array containing just 1? To keep the grammar unambiguous, we’ll require parentheses (as at present), braces, or both.
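To make the ambiguity concrete, here are the two readings written out as ordinary JavaScript. This is only an illustration: the function names and the sample object are made up for the example.

```javascript
// Two valid JavaScript readings of the paren-free, brace-free text
//   if a [1].forEach(f)
// Both function names are hypothetical and exist only for this illustration.

// Reading 1: the condition is `a`, and the body is `[1].forEach(f)`.
function conditionIsA(a, f) {
  if (a) [1].forEach(f);
}

// Reading 2: the condition is `a[1].forEach(f)`, and the body comes afterwards.
function conditionIsCall(a, f) {
  if (a[1].forEach(f)) {
    // the body would go here
  }
}

const sample = { 1: ['x', 'y'] };
const seenByReading1 = [];
const seenByReading2 = [];

conditionIsA(sample, (item) => seenByReading1.push(item));
conditionIsCall(sample, (item) => seenByReading2.push(item));

console.log(seenByReading1); // items of the literal array [1]
console.log(seenByReading2); // items of the array stored at sample[1]
```

The same characters lead to two different programs, which is why at least one pair of delimiters has to stay.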

Babel Plugins

Babel has a few babel-plugin-syntax-x packages, but all these do is change options in the parser.

The actual jsx plugin, for example, is built into Babylon, the parser used by Babel.

However, we can add our own plugins through unofficial means: an object containing all of the parser’s plugins can be accessed and modified. Leaving out the body of createSwiftBlocksPlugin for now, it can be done with the following snippet.
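As a rough sketch of the idea only, and not Babylon’s real code: the plugins object below is a local stand-in for the one kept inside Babylon (how you reach the real one is undocumented and depends on the version), ToyParser and its loaded array are hypothetical, and the body of createSwiftBlocksPlugin is a placeholder.

```javascript
// ASSUMPTION: a local stand-in for the plugin object kept inside Babylon.
const plugins = {};

// Placeholder body: a real plugin would call instance.extend(...) here.
function createSwiftBlocksPlugin(instance) {
  instance.loaded.push('swiftBlocks');
}

// Registering a plugin is just adding a property to the shared object.
plugins.swiftBlocks = createSwiftBlocksPlugin;

// Hypothetical toy parser that applies every plugin named in its options.
class ToyParser {
  constructor(options) {
    this.loaded = [];
    for (const name of options.plugins) {
      const plugin = plugins[name];
      if (plugin) plugin(this); // unknown names are silently ignored
    }
  }
}

const toyParser = new ToyParser({ plugins: ['swiftBlocks', 'notRegistered'] });
console.log(toyParser.loaded);
```

The point is only that a plugin is looked up by name in a plain object, so adding a key is enough to make it available.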

How Babel Parses Code

The package we’re looking at here is Babylon. It reads text, translates it to a series of tokens, and then to a tree of nodes. To demonstrate tokens and nodes, if we had the code 1 + 2, the tokens would be a number token, an addition operator token, and another number token; and the nodes would start with an addition node, with both the left- and right-hand sides being number nodes.
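The 1 + 2 example can be sketched with a toy tokenizer and parser. This is not how Babylon is implemented, and the token type names ('num', 'plus') are made up here. The node type names BinaryExpression and NumericLiteral are the ones Babel uses for these nodes.

```javascript
// Toy tokenizer: understands only non-negative integers and "+".
function tokenize(code) {
  const pieces = code.match(/\d+|\+/g) || [];
  return pieces.map((text) =>
    text === '+'
      ? { type: 'plus', value: '+' }
      : { type: 'num', value: Number(text) }
  );
}

// Toy parser: turns `num (+ num)*` into left-nested addition nodes.
// Anything after the last complete addition is ignored in this sketch.
function parseExpression(tokens) {
  let position = 0;

  const parseNumber = () => {
    const token = tokens[position++];
    if (!token || token.type !== 'num') {
      throw new SyntaxError('Expected a number');
    }
    return { type: 'NumericLiteral', value: token.value };
  };

  let left = parseNumber();
  while (position < tokens.length && tokens[position].type === 'plus') {
    position++; // skip the "+" token
    const right = parseNumber();
    left = { type: 'BinaryExpression', operator: '+', left, right };
  }
  return left;
}

const tokens = tokenize('1 + 2');
const ast = parseExpression(tokens);

console.log(tokens.map((token) => token.type));
console.log(JSON.stringify(ast, null, 2));
```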

We start with a State class, which contains the current parsing position — the number of characters we are into the code — and an array of tokens. It contains a lot of other things too, but we won’t cover them here.

We then have a Tokenizer class, which contains a State object. It also contains some operations that deal with the tokens, such as next to go to the next token, match to see if the current token matches a token type, eat to go to the next token if it matches a token type, and lookahead to look at what the next token will be.

Finally, we have a Parser class, which extends the Tokenizer class. In addition to having the low-level token operations mentioned before, it has some low-level node operations, such as startNode and finishNode, and some higher-level node operations, such as parseStatement.

It’s worth noting that these higher-level node operations are also operating on tokens, and that new tokens are only being read when we call this.next().
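The three classes can be sketched as a toy. It shares only the class and method names with Babylon. The internals, the token type names, and the helpers readTokenAt, parseNumber, and parseExpression are assumptions made for the example.

```javascript
// ASSUMPTION: toy classes that only borrow names from Babylon.
class State {
  constructor(input) {
    this.input = input;
    this.pos = 0;      // how many characters into the code we are
    this.tokens = [];  // every token read so far
    this.type = null;  // type of the current token
    this.value = null; // value of the current token
  }
}

class Tokenizer {
  constructor(input) {
    this.state = new State(input);
    this.next(); // read the first token
  }

  // Hypothetical helper: reads one token from `pos` without touching the state.
  readTokenAt(pos) {
    const { input } = this.state;
    while (pos < input.length && input[pos] === ' ') pos++;
    if (pos >= input.length) return { type: 'eof', value: null, end: pos };
    const digits = /^\d+/.exec(input.slice(pos));
    if (digits) {
      return { type: 'num', value: Number(digits[0]), end: pos + digits[0].length };
    }
    if (input[pos] === '+') return { type: 'plus', value: '+', end: pos + 1 };
    throw new SyntaxError(`Unexpected character at ${pos}`);
  }

  // Go to the next token; this is the only place the position advances.
  next() {
    const token = this.readTokenAt(this.state.pos);
    this.state.pos = token.end;
    this.state.type = token.type;
    this.state.value = token.value;
    this.state.tokens.push(token);
  }

  // Does the current token have this type?
  match(type) {
    return this.state.type === type;
  }

  // Go to the next token only if the current one has this type.
  eat(type) {
    if (!this.match(type)) return false;
    this.next();
    return true;
  }

  // Peek at the upcoming token without consuming it.
  lookahead() {
    return this.readTokenAt(this.state.pos);
  }
}

class Parser extends Tokenizer {
  startNode() {
    return { type: null };
  }

  finishNode(node, type) {
    node.type = type;
    return node;
  }

  parseNumber() {
    if (!this.match('num')) throw new SyntaxError('Expected a number');
    const node = this.startNode();
    node.value = this.state.value;
    this.next();
    return this.finishNode(node, 'NumericLiteral');
  }

  // A higher-level operation built from the token and node operations above.
  parseExpression() {
    let left = this.parseNumber();
    while (this.eat('plus')) {
      const node = this.startNode();
      node.operator = '+';
      node.left = left;
      node.right = this.parseNumber();
      left = this.finishNode(node, 'BinaryExpression');
    }
    return left;
  }
}

const parser = new Parser('1 + 2');
const tokensReadBefore = parser.state.tokens.length;
const upcomingType = parser.lookahead().type;
const tokensReadAfterLookahead = parser.state.tokens.length;
const tree = parser.parseExpression();
console.log(upcomingType, tokensReadBefore, tokensReadAfterLookahead, tree.type);
```

Note that lookahead leaves the token list alone in this sketch; only next, directly or through eat, reads a new token.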

Extending the Syntax

A plugin is actually a function that takes one argument, instance. For each higher-level node operation you need to change, you call instance.extend(hook, customParseFunction). Using the above code, we can completely reimplement parseIfStatement to allow paren-free if statements.

However, in some cases, you don’t want to reimplement everything. Calling inner.call(this, node) will go back to the original parseIfStatement (or to another plugin’s version of it).
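A minimal sketch of the extend and inner mechanics, on a toy class rather than Babylon’s Parser. The assumption here is that extend replaces a method with whatever the supplied function returns, handing it the previous implementation as inner. The token array, the parsedBy field, and toyPlugin are invented for the example.

```javascript
// ASSUMPTION: a toy stand-in for the parser; only `extend` and
// `parseIfStatement` borrow their names from the text above.
class ToyParser {
  constructor(tokens) {
    this.tokens = tokens;
  }

  // Replace a method with a wrapper that receives the previous implementation.
  extend(name, f) {
    this[name] = f(this[name]);
  }

  parseIfStatement(node) {
    node.type = 'IfStatement';
    node.parsedBy = 'original'; // hypothetical marker, for the demo only
    return node;
  }
}

// A plugin is a function that takes the parser instance.
function toyPlugin(instance) {
  instance.extend('parseIfStatement', function (inner) {
    return function (node) {
      const hasParens = this.tokens[1] === '(';
      // Parenthesised ifs are left to the original implementation.
      if (hasParens) return inner.call(this, node);
      node.type = 'IfStatement';
      node.parsedBy = 'plugin';
      return node;
    };
  });
}

const withParens = new ToyParser(['if', '(', 'a', ')', '{', '}']);
const parenFree = new ToyParser(['if', 'a', '{', '}']);
toyPlugin(withParens);
toyPlugin(parenFree);

const parenResult = withParens.parseIfStatement({});
const parenFreeResult = parenFree.parseIfStatement({});
console.log(parenResult.parsedBy, parenFreeResult.parsedBy);
```

Because each call to extend wraps whatever is currently installed, a second plugin extending the same hook would receive this wrapper as its inner.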

Another issue is that else ifs won’t work without putting a brace before the if. The completed code looks more like the following.
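To show the shape of the rule on something runnable, here is a toy parser, written from scratch for this illustration and not taken from the plugin itself. It assumes that conditions and statements are single identifiers and that tokens are plain strings. It enforces "parentheses or braces" and lets else if chain without an extra brace.

```javascript
// ASSUMPTION: a self-contained toy, unrelated to Babylon's internals.
class ToyParser {
  constructor(code) {
    // Tokens are identifiers/keywords and the four delimiters.
    this.tokens = code.match(/[A-Za-z_]\w*|[(){}]/g) || [];
    this.pos = 0;
  }

  match(text) {
    return this.tokens[this.pos] === text;
  }

  eat(text) {
    if (!this.match(text)) return false;
    this.pos++;
    return true;
  }

  expect(text) {
    if (!this.eat(text)) throw new SyntaxError(`Expected "${text}"`);
  }

  parseIdentifier() {
    const name = this.tokens[this.pos];
    if (!/^[A-Za-z_]/.test(name || '')) {
      throw new SyntaxError('Expected an identifier');
    }
    this.pos++;
    return { type: 'Identifier', name };
  }

  parseBlock() {
    this.expect('{');
    const body = [];
    while (!this.match('}')) body.push(this.parseStatement());
    this.expect('}');
    return { type: 'BlockStatement', body };
  }

  parseStatement() {
    if (this.match('if')) return this.parseIfStatement();
    if (this.match('{')) return this.parseBlock();
    return { type: 'ExpressionStatement', expression: this.parseIdentifier() };
  }

  parseIfStatement() {
    this.expect('if');
    const hasParens = this.eat('(');
    const test = this.parseIdentifier();
    if (hasParens) this.expect(')');
    // Without parentheses, the body must be a braced block.
    const consequent = hasParens ? this.parseStatement() : this.parseBlock();
    // parseStatement dispatches back here on `if`, so `else if` chains work.
    const alternate = this.eat('else') ? this.parseStatement() : null;
    return { type: 'IfStatement', test, consequent, alternate };
  }
}

const chained = new ToyParser('if a { b } else if (c) d else { e }').parseStatement();
console.log(JSON.stringify(chained, null, 2));

let rejected = null;
try {
  new ToyParser('if a b').parseStatement(); // neither parentheses nor braces
} catch (error) {
  rejected = error;
}
console.log(rejected && rejected.message);
```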

Closing Words

The default behaviour of Babel is to convert the nodes generated from parsing into new code. Since we produce the same nodes Babel normally uses for if statements, the generated code will include parentheses.
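The reason is that an if node does not record whether the source had parentheses, so whatever prints it decides. A toy printer, written for this illustration and not Babel’s generator, makes the point:

```javascript
// ASSUMPTION: a tiny hand-written printer covering only four node types.
function generate(node) {
  switch (node.type) {
    case 'Identifier':
      return node.name;
    case 'ExpressionStatement':
      return `${generate(node.expression)};`;
    case 'BlockStatement':
      return `{ ${node.body.map(generate).join(' ')} }`;
    case 'IfStatement': {
      // The parentheses come from the printer, not from the node.
      const head = `if (${generate(node.test)}) ${generate(node.consequent)}`;
      return node.alternate ? `${head} else ${generate(node.alternate)}` : head;
    }
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// The node a paren-free `if a { b }` would produce: nothing about parentheses.
const ifNode = {
  type: 'IfStatement',
  test: { type: 'Identifier', name: 'a' },
  consequent: {
    type: 'BlockStatement',
    body: [
      { type: 'ExpressionStatement', expression: { type: 'Identifier', name: 'b' } },
    ],
  },
  alternate: null,
};

const printed = generate(ifNode);
console.log(printed); // if (a) { b; }
```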

This post only covered how to add syntax to create already-existing nodes. However, there is also the use case where you want to create new nodes entirely, such as for the pipeline operator. It is possible with Babel, and I’ll aim to cover that in another post.

The code can be viewed on GitHub.
