Uncover the JavaScript: Source Code to Machine Instruction Generation

Md. Misbahul Alam Chowdhury
4 min read · Oct 16, 2018

--

A JavaScript engine interprets JavaScript code. Interpretation means reading and executing the code. Developers don’t need to understand this process in great depth, but they should understand its basic building blocks.

In the previous, and very first, article of this series, I wrote that I would focus on the engine in the next article. So, here it is.

It is often said that a JavaScript engine is an interpreter. Today, that is not entirely true: modern engines combine an interpreter with a compiler.

Compiler and Interpreter

We programmers write code to instruct the machine to do something. Unfortunately, we and the machine speak different languages. Think of two people who need to communicate but share no common language. Say I know Bengali and you know English. We have roughly two options for communicating.

  1. I can write the article and translate it line by line with a translator such as Google Translate.
  2. I can write the article and send it to a professional translator, who will read the whole article and then produce a better translation.

The first option produces a result with very little latency, but of poor quality. An interpreter follows this approach. It walks through the code, translates it to machine instructions, and executes them. Every time the program runs, it goes through the same process.

The second option produces much better output at the cost of higher latency. A compiler follows this approach. It produces a machine-understandable program file. Whenever we need to execute the program, we run that file instead.

A JavaScript engine works as an interpreter, but in a somewhat compiler-like fashion. Instead of going through the code strictly line by line, it first roughly scans a portion of the code and takes notes. This note-taking is called the creation of an execution context, which will be covered in a future article. Then the engine walks through the code to interpret it.
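One observable effect of this "scan first, then interpret" behavior is that a var declaration is known to the engine before the line that assigns it has run. The snippet below is a minimal sketch of that effect only; the function name readBeforeAssignment is made up for illustration, and the details of execution contexts are left for the later article.

```javascript
// Hypothetical helper: reads a `var` before the line that assigns it has run.
function readBeforeAssignment() {
  // `message` is already known here because its declaration was noted
  // before execution started; only the assignment has not happened yet.
  const before = message;
  var message = "assigned";
  return [before, message];
}

// The first element is undefined (no ReferenceError), the second is "assigned".
console.log(readBeforeAssignment());
```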

Lexical analysis to generate token chain

Lexical analysis is the very first step of compilation or interpretation. It is also called tokenization. As the name suggests, it simply breaks the source code down into tokens. A token must have a name and may have a value. A token can be written symbolically as <name, value(optional)>. Say we have code like this:

var x = 1;

A tokenizer will break it down into five tokens:

<KEYWORD, var> <ID, x> <EQUALS> <INTEGER, 1> <SEMICOLON>
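The tokenization step can be sketched in a few lines of JavaScript. This is a toy tokenizer, not how any real engine is implemented: it only understands the keyword var, identifiers, whole numbers, = and ;, and the function name tokenize and the { name, value } object shape are assumptions chosen to mirror the <name, value> notation.

```javascript
// Toy keyword list; a real language has many more keywords.
const KEYWORDS = new Set(["var"]);

function tokenize(source) {
  const tokens = [];
  // One alternative per kind of lexeme; whitespace is matched and skipped.
  // The sticky flag (y) forces each match to start exactly at lastIndex.
  const pattern = /\s+|([A-Za-z_$][\w$]*)|(\d+)|(=)|(;)/y;
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (match === null) {
      throw new SyntaxError(`Unexpected character '${source[pos]}' at ${pos}`);
    }
    const [, word, digits, equals, semicolon] = match;
    if (word !== undefined) {
      tokens.push({ name: KEYWORDS.has(word) ? "KEYWORD" : "ID", value: word });
    } else if (digits !== undefined) {
      tokens.push({ name: "INTEGER", value: Number(digits) });
    } else if (equals !== undefined) {
      tokens.push({ name: "EQUALS" });
    } else if (semicolon !== undefined) {
      tokens.push({ name: "SEMICOLON" });
    }
    pos = pattern.lastIndex;
  }
  return tokens;
}

// Produces five tokens: KEYWORD, ID, EQUALS, INTEGER, SEMICOLON.
console.log(tokenize("var x = 1;"));
```

Anything outside this tiny vocabulary, such as the decimal point in 2.3, makes the sketch throw a SyntaxError.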

Syntax analysis or parsing to generate AST

The syntax analyzer reads through the tokens and checks them against the rules of the language.

Say a rule of the language is:

assignment = ID followed-by EQUALS followed-by INTEGER

According to this rule, x = 1 is an assignment. But var x = 1, x = y, and x = 2.3 are not, even though we know they are also assignments. Defining a complete set of rules and parsing against them is a complex job.
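To make the rule concrete, here is a minimal sketch of a checker for exactly this one rule. It assumes tokens are plain objects of the form { name, value }, and the function name parseAssignment and the returned node shape are invented for illustration. A real parser handles a full grammar, not a single fixed sequence.

```javascript
// Checks the toy rule: assignment = ID followed-by EQUALS followed-by INTEGER.
function parseAssignment(tokens) {
  const expected = ["ID", "EQUALS", "INTEGER"];
  const matches =
    tokens.length === expected.length &&
    tokens.every((token, i) => token.name === expected[i]);
  if (!matches) {
    const got = tokens.map((t) => t.name).join(" ");
    throw new SyntaxError(`Expected ${expected.join(" ")}, got ${got}`);
  }
  const [id, , integer] = tokens;
  // On success, return a small tree node instead of a flat token list.
  return { type: "Assignment", target: id.value, value: integer.value };
}

// x = 1 satisfies the rule.
console.log(
  parseAssignment([
    { name: "ID", value: "x" },
    { name: "EQUALS" },
    { name: "INTEGER", value: 1 },
  ])
);
```

Passing the tokens for x = y (ID EQUALS ID) makes this sketch throw, which is the behavior the rule describes.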

If the token chain complies with the syntax of the language, the syntax analyzer generates an Abstract Syntax Tree (AST); otherwise it throws an error. An AST is a tree representation of the source code.

Say, x = 1 will be represented as the following tree.

Figure: Simple Abstract Syntax Tree (AST)
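In code, such a tree is commonly held as nested objects. The sketch below shows one possible shape for x = 1, loosely modeled on the ESTree convention used by many JavaScript tools. The node names are an assumption and may differ from the original figure, and renderTree is a made-up helper that prints the tree as an indented outline.

```javascript
// Assumed node shape, loosely modeled on ESTree.
const assignmentAst = {
  type: "AssignmentExpression",
  operator: "=",
  left: { type: "Identifier", name: "x" },
  right: { type: "Literal", value: 1 },
};

// Renders a node and its left/right children as indented lines of text.
function renderTree(node, depth = 0) {
  const indent = "  ".repeat(depth);
  const label = node.name ?? node.value ?? node.operator ?? "";
  const lines = [`${indent}${node.type}${label === "" ? "" : ` (${label})`}`];
  for (const child of [node.left, node.right]) {
    if (child) lines.push(...renderTree(child, depth + 1));
  }
  return lines;
}

// Prints the root followed by its two indented children.
console.log(renderTree(assignmentAst).join("\n"));
```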

A more complex example gives a better idea of what an AST looks like. Say there is a function like this:

function square (x) {
    var result = x * x;
    return result;
}

The AST representation will look like this:

Figure: Complex Abstract Syntax Tree (AST)
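As a rough textual stand-in for the tree, the square function could be represented with nested objects like the ones below. The node types follow the ESTree convention as an assumption. The helper listNodeTypes is hypothetical and simply walks the tree depth-first to show how deeply the nodes nest.

```javascript
// Assumed, ESTree-inspired tree for:
// function square (x) { var result = x * x; return result; }
const squareAst = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "square" },
  params: [{ type: "Identifier", name: "x" }],
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "VariableDeclaration",
        kind: "var",
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "result" },
            init: {
              type: "BinaryExpression",
              operator: "*",
              left: { type: "Identifier", name: "x" },
              right: { type: "Identifier", name: "x" },
            },
          },
        ],
      },
      {
        type: "ReturnStatement",
        argument: { type: "Identifier", name: "result" },
      },
    ],
  },
};

// Visits every node depth-first and collects the node types in visiting order.
function listNodeTypes(node, types = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => listNodeTypes(child, types));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.type === "string") types.push(node.type);
    Object.values(node).forEach((child) => listNodeTypes(child, types));
  }
  return types;
}

console.log(listNodeTypes(squareAst));
```

Even this three-line function yields a dozen nodes, which hints at why real-world ASTs grow large quickly.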

Machine code generation and execution

Once the AST is ready, the compiler or interpreter generates machine-dependent code from it. Different compilers and interpreters follow different steps to do this. Some generate assembly and some don’t. Sometimes the code is optimized a little at this stage. I won’t go into more detail on this topic.
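To give a feel for what "generating code from the AST" means, here is a small sketch that turns a tree into a flat list of instructions and then runs them. The instruction set (PUSH, LOAD, ADD, MUL, STORE) is invented for this example and targets an imaginary stack machine. It is not the bytecode or machine code of any real engine, and the node shapes are the same assumed ESTree-like ones.

```javascript
// Made-up mapping from source operators to made-up instructions.
const OPCODES = { "+": "ADD", "*": "MUL" };

// Walks the tree and appends instructions in the order they must execute.
function generate(node, out = []) {
  switch (node.type) {
    case "Literal":
      out.push(["PUSH", node.value]);
      break;
    case "Identifier":
      out.push(["LOAD", node.name]);
      break;
    case "BinaryExpression": {
      const opcode = OPCODES[node.operator];
      if (!opcode) throw new Error(`Unsupported operator ${node.operator}`);
      generate(node.left, out);
      generate(node.right, out);
      out.push([opcode]);
      break;
    }
    case "AssignmentExpression":
      generate(node.right, out);
      out.push(["STORE", node.left.name]);
      break;
    default:
      throw new Error(`Unsupported node type ${node.type}`);
  }
  return out;
}

// Executes the instruction list on a simple stack machine.
function run(instructions, variables = {}) {
  const stack = [];
  for (const [op, arg] of instructions) {
    if (op === "PUSH") stack.push(arg);
    else if (op === "LOAD") stack.push(variables[arg]);
    else if (op === "STORE") variables[arg] = stack.pop();
    else {
      const right = stack.pop();
      const left = stack.pop();
      stack.push(op === "ADD" ? left + right : left * right);
    }
  }
  return variables;
}

// Tree for: result = x * x
const ast = {
  type: "AssignmentExpression",
  operator: "=",
  left: { type: "Identifier", name: "result" },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Identifier", name: "x" },
    right: { type: "Identifier", name: "x" },
  },
};

const code = generate(ast); // LOAD x, LOAD x, MUL, STORE result
console.log(code, run(code, { x: 4 }));
```

The point of the sketch is the shape of the process: a nested tree goes in, and a linear sequence of simple steps comes out.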

JavaScript JIT compilation

So far we have seen that compilation and interpretation involve a lot of work. If a piece of code needs to be interpreted only a few times, that is perfectly fine. But what if some code needs to be interpreted many times? Say the code sits inside a long-running loop, or inside a frequently called function. Then interpreting it every time is not efficient.

To handle this problem, JavaScript engines use a watcher to count how often each portion of the code is used. Based on the watcher’s report, the engine decides to compile some of the code, and sometimes even to optimize it.

To do the compilation, JavaScript engines use a Just-In-Time (JIT) compiler.
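The watcher idea can be sketched as a simple counter. Everything here is hypothetical: the name createWatcher, the threshold of 3, and the "interpret" and "compile" labels are chosen only for illustration. Real engines rely on their own, far more elaborate heuristics.

```javascript
// Hypothetical threshold after which a piece of code counts as "hot".
const HOT_THRESHOLD = 3;

function createWatcher(threshold = HOT_THRESHOLD) {
  const callCounts = new Map();
  return {
    // Records one execution of `name` and reports how it should be run.
    record(name) {
      const count = (callCounts.get(name) ?? 0) + 1;
      callCounts.set(name, count);
      return count >= threshold ? "compile" : "interpret";
    },
    // Returns how many times `name` has been recorded so far.
    count(name) {
      return callCounts.get(name) ?? 0;
    },
  };
}

const watcher = createWatcher();
const decisions = [];
for (let i = 0; i < 4; i++) {
  decisions.push(watcher.record("square"));
}
// The first two calls are interpreted; from the third call on, the code is hot.
console.log(decisions);
```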

Conclusion

In this article, I tried to present an overview of JavaScript interpretation and compilation: the way the code is processed and executed. But there is a lot more to learn about JavaScript. The most important topic coming next is the workflow of the engine. As the title suggests, that article will give a good overview of the whole life cycle of JavaScript code execution inside the engine.

Articles in this series

  1. Engine vs Runtime
  2. Source Code to Machine Instruction Generation
  3. Workflow of JavaScript Engine (upcoming)
