The idea for this article popped into my mind while I was working on my Webflow/React transpiler. All I wanted to do was take a JS code string and transform it in such a way that globals won’t be redefined if they are already defined.
At the beginning I thought I could do that with some help from a regular expression, but boy was I wrong.
A regular expression is simply not enough, because it completely ignores the concept of scoped variables and treats the code as plain text. To determine whether a variable is global, what we need to ask ourselves is: is this variable already declared in the current scope or in one of its parent scopes?
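To make the problem concrete, here is a small sketch of a naive regex-based rewrite. The input string and the pattern are made up for illustration; the point is only that the pattern has no idea which foo is local and which is global.

```javascript
// `foo` is declared inside `init`, so the first assignment is local;
// only the assignment on the last line targets a global.
const source = [
  'function init() {',
  '  var foo',
  '  foo = "local"',
  '}',
  'foo = "global"',
].join('\n')

// Naive approach: prefix every "<name> =" at the start of a line with "window."
const naive = source.replace(/^(\s*)(\w+)\s*=/gm, '$1window.$2 =')

// Both assignments are rewritten, including the one to the local variable.
console.log(naive)
```

The local assignment inside init gets rewritten as well, which changes the meaning of the program.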
The way to answer such a question is to break the code down into nodes, where each node represents a part of our code and all the nodes are connected to each other in a parent-child relationship. This whole formation of nodes is called an AST (abstract syntax tree), and it can be used to easily look up scopes, variables and other elements related to our code.
An example AST may look like so:
Example taken from Lachezar Nickolov’s article about JS ASTs.
Obviously, breaking down our code into nodes is not a walk in the park. Luckily, we have a tool called Babel which already does that.
Babel to the rescue
Babel is a project which originally started out to transform the latest ES20XX syntax into ES5 syntax for better browser compatibility. As the ECMAScript committee keeps updating the standards of the ECMAScript language, plug-ins provide an excellent and maintainable way to update the Babel compiler’s behavior.
Babel is made up of numerous components which work together to bring the latest ECMAScript syntax to life. Specifically, the code transformation flow involves the following components and relations:
- The parser parses the code string into a data representational structure called AST (abstract syntax tree) using
- The AST is being manipulated by pre-defined plug-ins which use
- The AST is being transformed back into code using
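The three stages can be mimicked with a toy pipeline. This is not Babel's code or API: parse, transform, generate and upperCase are hypothetical names, and the "parser" only understands a single statement of the form name = "text". It is just a sketch of how the stages hand the AST to each other.

```javascript
// Stage 1: parse a statement of the form `name = "text"` into a tiny AST.
function parse(code) {
  const match = /^\s*([A-Za-z_$][\w$]*)\s*=\s*"([^"]*)"\s*$/.exec(code)
  if (!match) throw new SyntaxError('Unsupported code: ' + code)
  return {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: match[1] },
    right: { type: 'StringLiteral', value: match[2] },
  }
}

// Stage 2: "plug-ins" here are plain functions that mutate the AST in place.
function transform(ast, plugins) {
  plugins.forEach((plugin) => plugin(ast))
  return ast
}

// Stage 3: print the AST back into a code string.
function generate(ast) {
  return ast.left.name + ' ' + ast.operator + ' "' + ast.right.value + '"'
}

// A hypothetical plug-in that upper-cases the assigned string.
const upperCase = (ast) => {
  ast.right.value = ast.right.value.toUpperCase()
}

const output = generate(transform(parse('foo = "foo"'), [upperCase]))
console.log(output) // foo = "FOO"
```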
Now that you have a better understanding of Babel, you can actually understand what’s happening when you build a plug-in. Speaking of which, how do we do that?
Building and using a Babel plug-in
First of all, I would like us to understand Babel’s generated AST. This is essential for building the plug-in, because the plug-in is going to manipulate the AST. If you go to astexplorer.net you’ll find an amazing tool that transforms code into an AST. Let’s take the code foo = "foo" as an example. The generated AST should look like so:
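In simplified form, the tree for foo = "foo" can be written as a plain object. This is a hand-written sketch: position data such as start, end and loc is left out, and the describe helper is a hypothetical function added only to print the tree.

```javascript
// Simplified AST for: foo = "foo"
// (Babel names the string node StringLiteral; location fields are omitted.)
const ast = {
  type: 'ExpressionStatement',
  expression: {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: 'foo' },
    right: { type: 'StringLiteral', value: 'foo' },
  },
}

// Hypothetical helper: list every node with its depth in the tree.
function describe(node, depth = 0) {
  const label = node.name ?? node.value ?? node.operator ?? ''
  const lines = ['  '.repeat(depth) + node.type + (label ? ' (' + label + ')' : '')]
  for (const child of Object.values(node)) {
    if (child && typeof child === 'object') lines.push(...describe(child, depth + 1))
  }
  return lines
}

console.log(describe(ast).join('\n'))
```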
As you can see, each node in the tree represents a part of the code, and the structure is recursive. The assignment expression foo = "foo" uses the operator =, the operand on the left is an identifier named foo and the operand on the right is a literal with the value "foo". So that’s how it goes: each part of the code can be represented as a node which is made up of other nodes, and each node has a type and additional properties based on its type.
Now let’s say that we would like to change the value "foo" to "bar". Hypothetically speaking, what we would have to do is grab the corresponding literal node and change its value from "foo" to "bar". Let’s take this simple example and turn it into a plug-in.
I’ve prepared a quick template project that you can use to quickly write plug-ins and test them by running a transformation. The project can be downloaded by cloning this repository. The project contains the following files:
- in.js: includes the input code that we would like to transform.
- out.js: includes the output of the code we’ve just transformed.
- transform.js: takes the code in in.js, transforms it, and writes the new code to out.js.
- plugin.js: the transformation plug-in that will be applied throughout the transformation.
To implement our plug-in, copy the following content and paste it in the in.js file:
foo = "foo"
and the following content to the plugin.js file:
To initiate the transformation, simply run $ node transform.js. Now open the out.js file and you should see the following content:
foo = "bar"
The visitor property is where the actual manipulation of the AST should be done. It walks through the tree and runs the handlers for each specified node type. In our case, whenever the visitor encounters an AssignmentExpression node, it will replace the right operand with "bar" in case we assign the "foo" value to foo. We can add a manipulation handler for any node type that we want; it can be Literal, or even Program, which wraps all the statements at the top of the AST.
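A minimal sketch of a plug-in that behaves as described could look like this. The function name replaceFooValue is my own, and in plugin.js the function would be exported (for example with module.exports) so that transform.js can pass it to Babel. The dry run at the bottom uses a hand-made stub instead of a real Babel path, purely to show what the handler does to the node.

```javascript
// Hypothetical sketch of the plug-in: a function returning an object
// with a `visitor` that holds one handler per node type.
function replaceFooValue() {
  return {
    visitor: {
      AssignmentExpression(path) {
        const { left, right } = path.node
        // Only touch `foo = "foo"`; every other assignment is left as-is.
        if (
          left.type === 'Identifier' && left.name === 'foo' &&
          right.type === 'StringLiteral' && right.value === 'foo'
        ) {
          right.value = 'bar'
        }
      },
    },
  }
}

// Dry run without Babel: the stub only carries the `node` the handler reads.
const stubPath = {
  node: {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: 'foo' },
    right: { type: 'StringLiteral', value: 'foo' },
  },
}
replaceFooValue().visitor.AssignmentExpression(stubPath)
console.log(stubPath.node.right.value) // bar
```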
So, going back to the main purpose for which we gathered, I’ll first provide you with a reminder:
We will first take all global assignments and turn them into member assignment expressions of window to prevent confusion and potential misunderstandings. I like to start by first exploring the desired AST output:
And then writing the plug-in itself accordingly:
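A minimal sketch of such a plug-in, assuming the handler receives Babel's types object as t and can call path.scope.hasBinding(). The name globalsToWindowMembers is hypothetical, and the t and stubPath objects at the bottom are stand-ins that only imitate the two builders and the scope check so the handler can be tried without Babel.

```javascript
// Hypothetical sketch: turn `foo = ...` into `window.foo = ...`
// whenever `foo` has no declaration in the current or a parent scope.
function globalsToWindowMembers({ types: t }) {
  return {
    visitor: {
      AssignmentExpression(path) {
        const { left } = path.node
        if (left.type === 'Identifier' && !path.scope.hasBinding(left.name)) {
          path.node.left = t.memberExpression(
            t.identifier('window'),
            t.identifier(left.name)
          )
        }
      },
    },
  }
}

// Stand-ins for @babel/types builders and for a Babel path (assumptions).
const t = {
  identifier: (name) => ({ type: 'Identifier', name }),
  memberExpression: (object, property) => ({ type: 'MemberExpression', object, property }),
}
const stubPath = {
  node: {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: 'foo' },
    right: { type: 'StringLiteral', value: 'foo' },
  },
  scope: { hasBinding: () => false }, // pretend `foo` is not declared anywhere
}

globalsToWindowMembers({ types: t }).visitor.AssignmentExpression(stubPath)
console.log(stubPath.node.left.object.name + '.' + stubPath.node.left.property.name) // window.foo
```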
I will now introduce you to 2 new concepts that I haven’t mentioned before but are used in the plug-in above:
- The types object is a Lodash-esque utility library for AST nodes. It contains methods for building, validating, and converting AST nodes. It’s useful for cleaning up AST logic with well thought out utility methods. Its method names should all be equivalent to camel-cased node types. All types are defined in @babel/types, and furthermore, I recommend you look at the source code as you build the plug-in in order to find the signatures of the node creators you need, since most of it is not documented. More information regarding types can be found here.
- Just like the types object, the scope object contains utilities which are related to the current node’s scope. It can check whether a variable is defined or not, generate unique variable IDs, or rename variables. In the plug-in above, we used the hasBinding() method to check whether the identifier has a corresponding declared variable or not by climbing up the AST. More information regarding scope can be found here.
Now we will add the missing piece of the puzzle, which is transforming assignment expressions into conditional assignment expressions. So we wanna turn this code:
window.foo = 'foo'
Into this code:
if (typeof window.foo === 'undefined') window.foo = 'foo'
If you investigate that code’s AST, you’ll see that we’re dealing with 3 new node types:
- UnaryExpression —
- BinaryExpression —
... === 'undefined'
- IfStatement —
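Written out by hand, the nesting of these three node types for the statement above looks roughly like this. It is a simplified sketch with location data omitted, built bottom-up so each node can be seen wrapping the previous one.

```javascript
// window.foo
const windowFoo = {
  type: 'MemberExpression',
  object: { type: 'Identifier', name: 'window' },
  property: { type: 'Identifier', name: 'foo' },
  computed: false,
}

// typeof window.foo
const typeofNode = {
  type: 'UnaryExpression',
  operator: 'typeof',
  prefix: true,
  argument: windowFoo,
}

// typeof window.foo === 'undefined'
const testNode = {
  type: 'BinaryExpression',
  operator: '===',
  left: typeofNode,
  right: { type: 'StringLiteral', value: 'undefined' },
}

// if (typeof window.foo === 'undefined') window.foo = 'foo'
const ifNode = {
  type: 'IfStatement',
  test: testNode,
  consequent: {
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: windowFoo,
      right: { type: 'StringLiteral', value: 'foo' },
    },
  },
  alternate: null,
}

console.log(ifNode.test.left.type) // UnaryExpression
```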
Notice how each node is composed out of the one above it. Accordingly, we will update our plug-in. We will keep the old logic, where we turn global variables into members of window, and on top of that, we will make it conditional with the IfStatement:
So basically what we do here is check whether we are dealing with a window member assignment expression, and if so we create the conditional statement and replace the current node with it. A few notes:
- Without getting fancy with the explanation, I’ve created a nested IfStatement simply because this is what is expected of me, according to the AST.
- I’ve used the replaceWith method to replace the current node with the newly created one. More about manipulation methods like replaceWith can be found here.
- Normally the AssignmentExpression handler should be called again, because technically I’ve created a new node of that type when we called the replaceWith method, but since I don’t wanna run another traversal for newly created nodes, I’ve called the skip method, otherwise I would have had an infinite recursion. More about visiting methods like skip can be found here.
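Putting the notes above together, the complete plug-in could be sketched as follows. The name defineGlobalsOnce is hypothetical, and the t and stubPath objects below it are stand-ins that imitate only the builders and path methods the handler calls, so the logic can be tried without Babel. With the real thing, t comes from @babel/types and path from the traversal.

```javascript
// Hypothetical sketch of the final plug-in.
function defineGlobalsOnce({ types: t }) {
  return {
    visitor: {
      AssignmentExpression(path) {
        const { left } = path.node
        // Step 1 (old logic): undeclared `foo` becomes `window.foo`.
        if (left.type === 'Identifier' && !path.scope.hasBinding(left.name)) {
          path.node.left = t.memberExpression(t.identifier('window'), t.identifier(left.name))
        }
        // Step 2: wrap `window.x = ...` in `if (typeof window.x === 'undefined')`.
        const target = path.node.left
        if (target.type === 'MemberExpression' && target.object.name === 'window') {
          const typeofNode = t.unaryExpression('typeof', target)
          const isUndefined = t.binaryExpression('===', typeofNode, t.stringLiteral('undefined'))
          const ifNode = t.ifStatement(isUndefined, t.expressionStatement(path.node))
          path.replaceWith(ifNode)
          path.skip() // do not visit the assignment we just re-inserted
        }
      },
    },
  }
}

// Stand-ins for the builders and the path (assumptions, not Babel itself).
const t = {
  identifier: (name) => ({ type: 'Identifier', name }),
  stringLiteral: (value) => ({ type: 'StringLiteral', value }),
  memberExpression: (object, property) => ({ type: 'MemberExpression', object, property }),
  unaryExpression: (operator, argument) => ({ type: 'UnaryExpression', operator, argument }),
  binaryExpression: (operator, left, right) => ({ type: 'BinaryExpression', operator, left, right }),
  expressionStatement: (expression) => ({ type: 'ExpressionStatement', expression }),
  ifStatement: (test, consequent) => ({ type: 'IfStatement', test, consequent }),
}
const stubPath = {
  node: { type: 'AssignmentExpression', operator: '=', left: t.identifier('foo'), right: t.stringLiteral('foo') },
  scope: { hasBinding: () => false },
  replaceWith(node) { this.replacement = node }, // record instead of replacing
  skip() { this.skipped = true },
}

defineGlobalsOnce({ types: t }).visitor.AssignmentExpression(stubPath)
console.log(stubPath.replacement.type, stubPath.skipped) // IfStatement true
```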
So there you go, by now the plug-in should be complete. It’s not the most complex plug-in out there but it’s definitely a good example for this intro that will give you a good basis for further plug-ins that you’ll build down the road.
As a recap, whenever you forget for any reason how a plug-in works, go through this article. As you work on the plug-in itself, explore the desired AST outcome at astexplorer.net, and for API docs I recommend working with this wonderful handbook.