Compiler Theory Explained to a 5-Year-Old Child

Tiago Aguiar
3 min read · Apr 14, 2019

For anyone who (like me) never had a pure technology background, core concepts such as compiler theory can be hard to digest, even after reading a decent number of books on the subject. In this series, I’m going to do my best to chop up and deliver dense concepts that might be overwhelming for people in a rush (or for someone with a test coming up). Either way, I highly recommend the fantastic book series “You Don’t Know JS” by Kyle Simpson, a.k.a. Getify, and all the magnificent training he provides on-site and online.

  1. Welcome to the Factory:
Industrial storage area by Charlize Birdsinger

Here’s how it works. Imagine yourself walking into a factory. Most factories need raw materials from different parts of the world to manufacture and deliver a single product, and JavaScript is no different. Would you be able to build a car if someone simply handed you some iron ore, aluminum, petroleum, and copper? The computer is in the same position when it is handed your source code as written. Essentially, the compiler’s duty is to transform source code, which the machine cannot run directly, into machine code that the computer can read and execute. Like most big industrial plants, the compiler must follow a series of steps to deliver the car (the code).

a) Tokenizing/Lexing.

Think of the iron ore in its natural state. When it gets into the factory, it must be turned into 1 mm rectangular steel plates and then broken up into smaller pieces. This is the tokenizing process. The iron ore is a statement (for example, var a = 2;). At this stage, the compiler breaks that string of characters into meaningful chunks known as tokens: var, a, =, 2, and ; (whitespace may or may not be kept as a token, depending on whether it is meaningful).
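To make the tokenizing step concrete, here is a minimal sketch of a tokenizer that only understands statements shaped like var a = 2;. The function name tokenize and the token type labels (Keyword, Identifier, Numeric, Punctuator) are assumptions made for illustration; a real JavaScript engine recognizes far more token kinds and does not necessarily work this way.

```javascript
// Minimal tokenizer sketch. The name `tokenize` and the token type labels
// are illustrative assumptions, not what a real engine uses.
function tokenize(source) {
  const tokens = [];
  // One alternative per token kind; the sticky flag (y) forces each match
  // to start exactly at `lastIndex`. Whitespace is matched but not kept.
  const pattern = /\s+|(var)\b|([A-Za-z_$][\w$]*)|(\d+)|([=;])/y;
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (!match) {
      throw new SyntaxError(`Unexpected character "${source[pos]}" at position ${pos}`);
    }
    const [, keyword, identifier, number, punctuator] = match;
    if (keyword) tokens.push({ type: "Keyword", value: keyword });
    else if (identifier) tokens.push({ type: "Identifier", value: identifier });
    else if (number) tokens.push({ type: "Numeric", value: number });
    else if (punctuator) tokens.push({ type: "Punctuator", value: punctuator });
    // Move past whatever was matched (including skipped whitespace).
    pos = pos + match[0].length;
  }
  return tokens;
}

// Prints the values of the five tokens: var, a, =, 2, ;
console.log(tokenize("var a = 2;").map((token) => token.value));
```

In this sketch whitespace is simply thrown away, which is one of the two options mentioned above: it only serves to separate the chunks.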

b) Parsing.

Once all the steel plates have been pressed, they are stored in a warehouse, where they are sorted and cataloged according to their future use. In the parsing stage, the stream (array) of tokens produced by tokenizing is turned into a tree of nested elements called an AST (Abstract Syntax Tree). The catalog entry each token receives is based on its grammatical role in the program. The tree for var a = 2; might start with a top-level node called VariableDeclaration, with a child node called Identifier (whose value is a) and another child called AssignmentExpression, which itself has a child called NumericLiteral (whose value is 2).
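As a rough sketch of the parsing step, the snippet below takes a hand-written token array for var a = 2; and builds a small tree: a VariableDeclaration node with an Identifier child and an AssignmentExpression child that holds a NumericLiteral. The function name parseVariableDeclaration, the token shape, and the children property are assumptions for illustration only; real parsers handle a full grammar and use richer node formats.

```javascript
// Parser sketch for exactly one statement shape: var <identifier> = <number> ;
// The function name, token shape, and node layout are illustrative assumptions.
function parseVariableDeclaration(tokens) {
  let index = 0;

  // Consume the next token, checking that it is what the grammar expects here.
  function expect(type, value) {
    const token = tokens[index];
    const wanted = value !== undefined ? value : type;
    if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
      throw new SyntaxError(`Expected ${wanted} at token ${index}`);
    }
    index += 1;
    return token;
  }

  expect("Keyword", "var");
  const name = expect("Identifier");
  expect("Punctuator", "=");
  const number = expect("Numeric");
  expect("Punctuator", ";");

  // Build the nested tree described in the text.
  return {
    type: "VariableDeclaration",
    children: [
      { type: "Identifier", value: name.value, children: [] },
      {
        type: "AssignmentExpression",
        children: [
          { type: "NumericLiteral", value: Number(number.value), children: [] },
        ],
      },
    ],
  };
}

// Hand-written token stream for: var a = 2;
const exampleTokens = [
  { type: "Keyword", value: "var" },
  { type: "Identifier", value: "a" },
  { type: "Punctuator", value: "=" },
  { type: "Numeric", value: "2" },
  { type: "Punctuator", value: ";" },
];

console.log(JSON.stringify(parseVariableDeclaration(exampleTokens), null, 2));
```

Notice that the tokens = and ; no longer appear as values in the tree: their job was to tell the parser how the pieces relate, and that relationship is now expressed by the nesting itself.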

c) Code-Generation.

Once all of the plates are sorted by their future use, the factory workers can carry them off to start making doors, hoods, and roofs, shaping them and joining the parts into what will later become a fully formed vehicle. Likewise, the code generator takes the AST and turns it into executable code. And just as big industries do their best to optimize resources, the JavaScript engine tries to optimize execution performance, for example by collapsing redundant elements.
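To give a feel for code generation, the sketch below walks a hand-written AST for var a = 2; and emits instructions for an imaginary machine, then runs them. The instruction names (DECLARE, PUSH, STORE), the functions generate and run, and the node layout are all made up for illustration; a real engine emits bytecode or native machine code and is far more sophisticated.

```javascript
// Code-generation sketch. The instruction set (DECLARE, PUSH, STORE) and the
// AST layout are invented for illustration; real engines work differently.
function generate(node) {
  switch (node.type) {
    case "VariableDeclaration": {
      const [identifier, assignment] = node.children;
      return [
        `DECLARE ${identifier.value}`, // reserve a slot for the variable
        ...generate(assignment),       // compute the value to assign
        `STORE ${identifier.value}`,   // store that value into the slot
      ];
    }
    case "AssignmentExpression":
      return node.children.flatMap((child) => generate(child));
    case "NumericLiteral":
      return [`PUSH ${node.value}`];
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// A tiny interpreter for the imaginary instructions, to show them executing.
function run(instructions) {
  const memory = {};
  const stack = [];
  for (const instruction of instructions) {
    const [op, arg] = instruction.split(" ");
    if (op === "DECLARE") memory[arg] = undefined;
    else if (op === "PUSH") stack.push(Number(arg));
    else if (op === "STORE") memory[arg] = stack.pop();
    else throw new Error(`Unknown instruction: ${instruction}`);
  }
  return memory;
}

// Hand-written AST for: var a = 2;
const exampleAst = {
  type: "VariableDeclaration",
  children: [
    { type: "Identifier", value: "a", children: [] },
    {
      type: "AssignmentExpression",
      children: [{ type: "NumericLiteral", value: 2, children: [] }],
    },
  ],
};

const instructions = generate(exampleAst);
console.log(instructions);      // [ 'DECLARE a', 'PUSH 2', 'STORE a' ]
console.log(run(instructions)); // { a: 2 }
```

The three instructions are the "car" in this analogy: a form the (imaginary) machine can execute directly, produced from the cataloged parts in the tree.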

Conclusion

Of course, the compilation process is much broader than what was painted here, but this picture captures the main idea of what happens under the hood. Those who want to master all the good and bad parts of JavaScript will need a more in-depth study.
