How does Elixir compile/execute code?
Elixir always compiles and always executes source code. Both elixir and elixirc do both things.
You read that right: always, compilation and execution.
elixir compiles (in addition to executing),
elixirc executes (in addition to compiling).
Main phases of Elixir compilation
elixir and elixirc work the same way:
- Load the contents of the file in memory.
- Produce an AST from it using a custom tokenizer and yecc.
- Expand macros, inline functions, …: a bunch of transformations are applied here, in what’s known as the expansion phase. That yields an expanded AST, which still conforms to the same spec.
- Transform that final AST into Erlang Abstract Format, which is a standard representation of an Erlang AST using Erlang terms.
- Manually build an abstract format tree for a function called __FILE__/1 in a module called elixir_compiler_X, where X is an integer, with the abstract format of the program from the step above as the function body.
- Compile the result to BEAM bytecode on the fly with compile:forms/2, which returns a binary (no file is written).
- Load said binary into the Erlang VM using the Erlang code server.
- Invoke elixir_compiler_X.__FILE__/1. Since this function has your whole program as its body, the VM is effectively running the program. A one-liner in an .ex(s) file that prints the current stack trace will report that function and module name.
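One way to see this, as a sketch: the line below prints the stack trace of the current process. Save it alone in a script (the file name is up to you) and run it with elixir. The exact frames, and the integer in the module name, may differ across Elixir versions.

```elixir
# Print the current process's stack trace. When this runs as top-level code,
# one of the frames should point at the __FILE__/1 function of an
# elixir_compiler_X module (assumption: behaviour as described in the steps above).
IO.inspect(Process.info(self(), :current_stacktrace))
```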
There is some nesting in this process that explains the loop illustrated in the picture above. This is due to the way module definition is implemented, but we’ll leave it here.
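Going back to the first phases in the list above, here is a rough way to peek at them from Elixir itself. This is only a sketch: Code.string_to_quoted/1 and Macro.expand/2 are public approximations, not the internal functions the compiler calls, and Macro.expand/2 expands just the outermost node rather than running the full expansion phase.

```elixir
# Source code held in memory as a string.
source = "if true, do: :yes"

# Tokenize and parse: the result is the AST (quoted expression).
{:ok, ast} = Code.string_to_quoted(source)
IO.inspect(ast, label: "parsed")

# Expand the outermost macro call. `if` is a macro defined in Kernel,
# so the expanded AST no longer has :if at its root.
expanded = Macro.expand(ast, __ENV__)
IO.inspect(expanded, label: "expanded")
```

Both values are plain Elixir terms, which is what "still conforms to the same spec" means in practice: the expanded tree is an AST too.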
elixir and elixirc do the same.
elixirc executes top-level and module-level code just like elixir does; it is the same code path.
For example, you can conditionally define a function while compiling. Why? Because the code is being executed. The other way around, elixir is able to invoke functions in modules defined in the same script. Why? Because they are compiled and loaded into the VM on the fly.
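A small script can show both directions at once. This is a sketch: the Greeter module and the GREETER_LOUD environment variable are made-up names for illustration.

```elixir
defmodule Greeter do
  # Module-level code: this `if` is executed while the module is being
  # defined, so only one of the two definitions ends up in the module.
  if System.get_env("GREETER_LOUD") do
    def greet(name), do: String.upcase("hello, #{name}!")
  else
    def greet(name), do: "hello, #{name}!"
  end
end

# Top-level code: by the time this line runs, Greeter has already been
# compiled and loaded, so it can be called right away.
IO.puts(Greeter.greet("world"))
```

The same file behaves the same way under elixirc: the condition is evaluated during compilation and the greeting is printed while compiling.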
Since programs executed by elixir are compiled, they run at the speed of compiled modules. Compilation has a cost, of course, so the wall-clock time is different, but the code itself runs equally fast.
The main difference between elixir and elixirc is that elixirc produces a .beam file per module as a side effect of module definition. It does so by dumping the binary returned by compile:forms/2. That’s about it.
Extensions in file names do not matter, .ex and .exs are only conventions.
You can also compile a file that contains five modules, and you’ll get five different .beam files, each named after the module name (regardless of the name of the file defining the modules).
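To illustrate the one-binary-per-module idea without touching the disk, the sketch below uses Code.compile_string/2, which compiles in memory and returns {module, binary} pairs. The Demo.A and Demo.B modules and the "demo.ex" file name are hypothetical; elixirc itself is not involved here.

```elixir
source = """
defmodule Demo.A do
  def hi, do: :a
end

defmodule Demo.B do
  def hi, do: :b
end
"""

# One source, two modules: each comes back with its own binary.
compiled = Code.compile_string(source, "demo.ex")

for {module, binary} <- compiled do
  # elixirc would write each binary to a file named after the module,
  # for example Elixir.Demo.A.beam, not after "demo.ex".
  IO.puts("#{module}.beam (#{byte_size(binary)} bytes)")
end
```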
Top-level or module-level code that does not end up in a persisted module attribute or a function is gone from the .beam files. Those files contain module definitions for the VM expressed in object code. Elixir is gone at that point: those are BEAM programs that could technically have been generated by some other tool.
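A minimal sketch of what survives and what does not, using a hypothetical Demo.Attrs module with made-up attribute names:

```elixir
defmodule Demo.Attrs do
  # Persisted attribute: stored in the compiled module.
  Module.register_attribute(__MODULE__, :built_by, persist: true)
  @built_by "a demo script"

  # Regular attribute: only exists while the module is being defined...
  @scratch "compile-time only"

  # ...unless its value is captured in a function body.
  def scratch, do: @scratch

  # Module-level side effect: runs during definition, leaves no trace.
  IO.puts("defining #{inspect(__MODULE__)}")
end

# The persisted attribute is visible in the module's metadata.
IO.inspect(Demo.Attrs.__info__(:attributes)[:built_by])
IO.inspect(Demo.Attrs.scratch())
```

The IO.puts call runs once, while the module is defined. Nothing in the resulting module remembers it.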
PS: Thanks a lot to José Valim for reviewing a draft of this post ❤️.