CODEX

Making Your Own JavaScript Linter (part 4)

A comprehensive tutorial

Joana Borges Late
Jan 16 · 5 min read
A linter running

This is the fourth and last part of a comprehensive tutorial on constructing a JavaScript linter. You can read the third part here.

And here is the source code of dirtyrat on GitHub.

Registering names

As we saw in earlier parts of this tutorial, when the linter, parsing token by token, finds a token of kind name, it calls a function that registers the name (token) as declared or as used.

Let’s take a look at the module register.

Basically, registering names means filling dictionaries of the source code file object (rat) using the pattern dictionary[branchedName]=token.

Besides preparing data for the later matching of names, the register reports a double declaration error as soon as it finds one.
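To make the pattern concrete, here is a minimal sketch of such a register. It is not the code of dirtyrat: the helper names (createRat, registerDeclaration, registerUse), the token shape, and the dot-separated format of the branched names are assumptions made only for illustration.

```javascript
// Hypothetical shape of the source code file object (rat).
// Object.create(null) gives dictionaries without inherited keys such as "constructor".
function createRat(path) {
    return {
        path: path,
        declared: Object.create(null),
        used: Object.create(null),
        errors: []
    };
}

// Follows the pattern dictionary[branchedName] = token.
// A second declaration under the same branched name is reported at once.
function registerDeclaration(rat, branchedName, token) {
    const previous = rat.declared[branchedName];
    if (previous !== undefined) {
        rat.errors.push(
            rat.path + ":" + token.row + " name '" + token.value +
            "' already declared at row " + previous.row
        );
        return false;
    }
    rat.declared[branchedName] = token;
    return true;
}

// Uses are only stored here; they are matched against declarations later.
function registerUse(rat, branchedName, token) {
    if (rat.used[branchedName] === undefined) {
        rat.used[branchedName] = token;
    }
}

const exampleRat = createRat("example.js");
registerDeclaration(exampleRat, "main.count", { kind: "name", value: "count", row: 3 });
registerDeclaration(exampleRat, "main.count", { kind: "name", value: "count", row: 9 });
registerUse(exampleRat, "main.loop.count", { kind: "name", value: "count", row: 12 });
console.log(exampleRat.errors); // one error, for the declaration at row 9
```

Keeping only the first use of each branched name is a simplification of this sketch; it is enough to decide later whether the name was ever declared.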

Checking names

Although there are no tricky parts in the module check-names, its code is a bit hard to read because I recently adapted it to work with JavaScript modules and have not had time to polish it. Therefore, we will look at a schematic version of this module:

The real code is more complicated because it organizes the warnings by kind. Thus, it has to fill lists with the warnings from each rat (the object that represents a source code file) and afterward show them list by list.

Remember that we had a problem with the same name having many identifiers (full names)? This is how we solve it: we use a function that recursively adjusts the full name (branched name) to an outer scope/block.
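The recursive adjustment could look roughly like the sketch below. It assumes, only for illustration, that a branched name is a dot-separated path of scopes ending with the name itself, like "main.loop.count", and that declarations and uses sit in plain dictionaries. The function names are hypothetical, and only the "used but not declared" warning is shown.

```javascript
// Returns the same name as seen from the next outer scope,
// or null when the name is already at the outermost level.
// Example: "main.loop.count" becomes "main.count".
function outerBranchedName(branchedName) {
    const parts = branchedName.split(".");
    if (parts.length < 2) { return null; }
    parts.splice(parts.length - 2, 1); // drop the innermost scope segment
    return parts.join(".");
}

// Looks for a declaration in the current scope, then recursively in outer scopes.
function findDeclaration(declared, branchedName) {
    if (branchedName === null) { return null; }
    const token = declared[branchedName];
    if (token !== undefined) { return token; }
    return findDeclaration(declared, outerBranchedName(branchedName));
}

// Schematic check: every used name must be declared in its scope or an outer one.
function checkUsedNames(declared, used) {
    const warnings = [];
    for (const branchedName of Object.keys(used)) {
        if (findDeclaration(declared, branchedName) === null) {
            const token = used[branchedName];
            warnings.push("row " + token.row + ": name '" + token.value + "' is not declared");
        }
    }
    return warnings;
}

const declaredNames = { "main.count": { value: "count", row: 3 } };
const usedNames = {
    "main.loop.inner.count": { value: "count", row: 12 }, // found as "main.count"
    "main.loop.totl": { value: "totl", row: 14 }          // a typo, never declared
};
console.log(checkUsedNames(declaredNames, usedNames));
```

The use of count is resolved after two adjustments ("main.loop.count", then "main.count"), while totl runs out of outer scopes and becomes a warning.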

Parsing expressions

Dirtyrat has a module for parsing standard expressions and another one for parsing literal expressions. Literal expressions are the ones that have no names (variables or function calls) inside. For the initialization of global variables, dirtyrat accepts only literal expressions.

This is the module for standard expressions:

The code for parsing expressions is fully written in recursive style. Writing the mechanics based on the main loop would only complicate things.

Despite the module scanner being fully linear (serving one token after the other) and the module expression being fully recursive, they match perfectly.
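The sketch below illustrates how a linear scanner and a recursive expression parser can fit together. It is a much reduced illustration, not the module of dirtyrat: the token kinds ("name", "number", "operator", "(" and ")"), the tiny grammar, and all function names are assumptions. As in a linter, nothing is evaluated; the code only checks that the tokens form a valid expression.

```javascript
// Linear scanner: serves one token after the other from a flat list.
function createScanner(tokens) {
    let index = 0;
    return {
        peek: function () { return tokens[index]; }, // undefined at the end
        eat: function () { index += 1; return tokens[index - 1]; }
    };
}

// expression := operand [ operator expression ]
function parseExpression(scanner) {
    parseOperand(scanner);
    const next = scanner.peek();
    if (next !== undefined && next.kind === "operator") {
        scanner.eat();
        parseExpression(scanner); // recursion handles the rest of the chain
    }
}

// operand := name | number | "(" expression ")"
function parseOperand(scanner) {
    const token = scanner.eat();
    if (token === undefined) { throw new Error("unexpected end of expression"); }
    if (token.kind === "name" || token.kind === "number") { return; }
    if (token.kind === "(") {
        parseExpression(scanner); // recursion handles the nesting
        const closing = scanner.eat();
        if (closing === undefined || closing.kind !== ")") { throw new Error("expected ')'"); }
        return;
    }
    throw new Error("unexpected token '" + token.value + "'");
}

// Returns null when the tokens form a valid expression, else an error message.
function lintExpression(tokens) {
    const scanner = createScanner(tokens);
    try {
        parseExpression(scanner);
    } catch (error) {
        return error.message;
    }
    const leftover = scanner.peek();
    return leftover === undefined ? null : "unexpected token '" + leftover.value + "'";
}

// Tokens for: a + (b * 2)
const goodTokens = [
    { kind: "name", value: "a" }, { kind: "operator", value: "+" },
    { kind: "(", value: "(" }, { kind: "name", value: "b" },
    { kind: "operator", value: "*" }, { kind: "number", value: "2" },
    { kind: ")", value: ")" }
];
console.log(lintExpression(goodTokens)); // null
```

The scanner never goes back and never looks more than one token ahead; all the bookkeeping for nested parentheses lives in the call stack of the recursive functions.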

The module Main

There is only one module left to examine, and it is the first to run: the module main. It starts by analyzing the command line arguments and sets which files to lint: a single JS file, the JS files linked by an HTML file, or the JS files inside a folder and its subfolders. Then it reads each file and sends the text to be parsed. After parsing all files, it calls the module check-names.
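The first decision of that flow can be sketched as below. This is only an illustration: the mode names, the function names, and the rule of deciding by file extension are assumptions, not the actual command line handling of dirtyrat. Reading the files and parsing them is left out.

```javascript
// Decides what to lint from the target given on the command line.
// Assumption: the target kind is recognized by its extension.
function chooseLintMode(target) {
    const lower = target.toLowerCase();
    if (lower.endsWith(".js")) { return "single-file"; }
    if (lower.endsWith(".html")) { return "files-linked-by-html"; }
    return "folder-and-subfolders";
}

// In Node.js the arguments would come from process.argv.slice(2).
function planRun(args) {
    if (args.length === 0) { return { mode: "show-usage", target: null }; }
    return { mode: chooseLintMode(args[0]), target: args[0] };
}

console.log(planRun(["app/index.html"]).mode); // files-linked-by-html
```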

Bonus: How to parse Python’s indentation

First, we must understand indentation as just another syntax for closing blocks. Opening a block is NOT the job of indentation. I know new blocks need a new (bigger) indentation. Here is the catch:

It is not the bigger indentation that opens a new block. It is a statement, like “for”, that creates a new block, and the new block demands a bigger indentation.

You may think that this is just blah-blah-blah. But it is a fundamental (although subtle) concept, one of those that make all the difference when you write the linter.

Second, we must keep our nice architecture, which puts all the info of the source code file into a list of tokens. How do we do this? We must convert the different margins in the source code into tokens that have both value and kind equal to “{” or “}”.

Therefore, the tokenizer has a job to do. We must not change any other module. Any other module will have no idea that the indentation of the source code is meaningful.

We adapt the tokenizer without making a mess of it.

1) The tokenizer must be aware at the start of each line whether the margin of the line should be ignored. There are three cases. A) It is a blank line. B) The line only contains remarks. C) The line is a continuation of the previous line (which ended with “+”, for example).

2) The tokenizer must “raise” an error when the size of the margin is not a multiple of 4.

3) The tokenizer must memorize the previous indentation. For every 4 whitespaces of difference between the current indentation and the previous one, it inserts a left (or right) curly brace token.
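The three rules above can be sketched as follows. This is an illustration under stated assumptions, not code from dirtyrat: each line of content is reduced to a single placeholder token of kind "line" (a real tokenizer would split it further), a continuation is detected only by a trailing “+” or backslash, errors are collected in a list, and the closing of the blocks still open at the end of the file is an extra step that the rules do not mention.

```javascript
const INDENT = 4;

// Counts the leading spaces of a line.
function marginOf(line) {
    let size = 0;
    while (size < line.length && line[size] === " ") { size += 1; }
    return size;
}

// Simplified test for case C: the line ends with "+" or a backslash.
function isContinued(line) {
    const text = line.trimEnd();
    return text.endsWith("+") || text.endsWith("\\");
}

function bracesFromIndentation(lines) {
    const tokens = [];
    const errors = [];
    let previousMargin = 0;
    let continued = false;
    for (let row = 1; row <= lines.length; row += 1) {
        const line = lines[row - 1];
        const text = line.trim();
        if (text === "" || text.startsWith("#")) { continue; } // rule 1, cases A and B
        if (!continued) {                                       // rule 1, case C
            const margin = marginOf(line);
            if (margin % INDENT !== 0) {                        // rule 2
                errors.push("row " + row + ": margin of " + margin + " is not a multiple of " + INDENT);
            } else {                                            // rule 3
                const brace = margin > previousMargin ? "{" : "}";
                const count = Math.abs(margin - previousMargin) / INDENT;
                for (let n = 0; n < count; n += 1) {
                    tokens.push({ kind: brace, value: brace, row: row });
                }
                previousMargin = margin;
            }
        }
        tokens.push({ kind: "line", value: text, row: row });   // placeholder token
        continued = isContinued(line);
    }
    // Extra step (assumption): close the blocks still open at the end of the file.
    for (let n = 0; n < previousMargin / INDENT; n += 1) {
        tokens.push({ kind: "}", value: "}", row: lines.length });
    }
    return { tokens: tokens, errors: errors };
}

const pythonLines = [
    "for fruit in fruits:",
    "    if fruit:",
    "        total = total +",
    "      1",
    "",
    "    # just a remark",
    "print(total)"
];
const indentResult = bracesFromIndentation(pythonLines);
console.log(indentResult.tokens.map(function (token) { return token.value; }).join(" "));
// for fruit in fruits: { if fruit: { total = total + 1 } } print(total)
```

Note that the odd margin of the line “      1” raises no error, because it continues the previous line, and that the remark with margin 4 does not close any block.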

The reasons why I created dirtyrat

I was working on two front-end applications of about 100 files each, before JavaScript had modules. Performance was critical for both apps, so I used to test each functionality with different versions of the code.

There was a time when the V8 engine ran the loop “for (var fruit of fruits) { … }” 3 times slower than the loop “for (var i = 0; i < fruits.length; i++) { … }”. No comment.

So, fearing any negative impact on performance, I wrote all the code in the plainest and simplest way possible: no structuring of functions inside any kind of object, no inner functions to provide privacy. I needed a tool to make checks for me, like name collisions and typos like “if (a = b) {”. I also needed a tool to enforce the code style that I chose.

At that time I tried the famous JSLint. JSLint checked my first file and told me that I was wrong for using “use strict” (only once) at the top of the file. I should remove it from the top and place it inside each function of the file… What could I say? “You are not helpiiiiiiing!”.

So I decided to create my own linter. I took that as a challenge and an opportunity to learn more about programming.

Note: the current version of dirtyrat targets JavaScript modules. You can read more on GitHub.

Conclusion

The initial idea for dirtyrat was just to complement the checks of the JavaScript engine. Over time, it has been improved. But it was never intended for public use. Its role here is just to illustrate the concepts. If I wanted dirtyrat to be widely used, I would have given it a more glamorous name like, say, “CleanCat”.

In case you have any questions, please write them in the “responses” section and I will do my best to answer.

Thank you!
