GHC’s Cross Compilation Pipeline

Today is going to be a bit more technical and of less direct practical utility. We will look at the steps that GHC takes to cross compile code via its LLVM backend.

In GHC’s Compiler Commentary we can see how the front end takes a Haskell file and, after parsing, renaming, type checking and desugaring, ends up with GHC’s Core language. Core is then processed repeatedly by the Simplification pass before being translated into STG and finally Cmm. Cmm is the language from which the three code generation backends in GHC take off.

The Cross Compilation Backend

The LLVM code generator takes in Cmm and turns it into LLVM intermediate representation (IR). The LLVM IR is then passed through the LLVM optimizer, the LLVM static compiler and GHC’s LLVM Mangler, before it is finally handed to the assembler and ends up as object code.
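
The order of the stages can be written down as a small table. The following is only a sketch in Python: the artifact labels are informal names chosen for this illustration, not identifiers from GHC, and the check merely confirms that each stage consumes what the previous one produced.

```python
# Each entry: (stage, what it consumes, what it produces).
# The artifact labels are informal and chosen for this illustration only.
STAGES = [
    ("LLVM code generator", "Cmm", "LLVM IR"),
    ("LLVM optimizer (opt)", "LLVM IR", "optimized LLVM IR"),
    ("LLVM static compiler (llc)", "optimized LLVM IR", "assembly"),
    ("LLVM Mangler", "assembly", "mangled assembly"),
    ("assembler", "mangled assembly", "object code"),
]


def artifact_chain(stages):
    """Return the artifacts in order, checking that the stages line up."""
    for (name, _, produced), (next_name, consumed, _) in zip(stages, stages[1:]):
        if produced != consumed:
            raise ValueError(f"{name} does not feed {next_name}")
    return [stages[0][1]] + [produced for _, _, produced in stages]


if __name__ == "__main__":
    print(" -> ".join(artifact_chain(STAGES)))
```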


The LLVM intermediate representation can be written either in its textual, human-readable form or as LLVM Bitcode. LLVM Bitcode is a binary format that is represented as a stream of bits. Values in the Bitcode format do not necessarily align with byte boundaries.
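
To see what "not aligned with byte boundaries" means in practice, here is a minimal bit writer in Python. It is a generic illustration of packing fixed-width fields into a bit stream, least significant bit first; the class name and the field widths are made up for the example, and it does not produce real LLVM Bitcode.

```python
class BitWriter:
    """Appends fixed-width unsigned fields to a stream of bits."""

    def __init__(self):
        self.bits = []  # individual bits (0 or 1) in emission order

    def write_fixed(self, value, width):
        """Append `value` as a `width`-bit field, least significant bit first."""
        if value < 0 or value >= (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        for i in range(width):
            self.bits.append((value >> i) & 1)

    def bit_length(self):
        return len(self.bits)

    def to_bytes(self):
        """Pad with zero bits up to the next byte boundary and pack."""
        padded = self.bits + [0] * (-len(self.bits) % 8)
        out = bytearray()
        for start in range(0, len(padded), 8):
            byte = 0
            for offset, bit in enumerate(padded[start:start + 8]):
                byte |= bit << offset
            out.append(byte)
        return bytes(out)


if __name__ == "__main__":
    writer = BitWriter()
    writer.write_fixed(5, 3)   # a 3-bit field
    writer.write_fixed(33, 6)  # a 6-bit field that straddles the first byte boundary
    writer.write_fixed(9, 4)   # a 4-bit field
    print(writer.bit_length(), writer.to_bytes().hex())
```

The three fields take 13 bits in total, so the second field starts in the first byte and ends in the second one. Only the padding at the very end brings the stream back to a byte boundary.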

GHC’s LLVM code generator currently produces textual IR. The textual IR is not guaranteed to be stable across LLVM releases, which is one of the reasons that GHC is usually tied to a specific LLVM release.

LLVM optimizer

The LLVM optimizer opt reads LLVM IR and writes LLVM IR after performing a set of optimizations. The LLVM IR that GHC produces uses GHC’s custom calling convention ghccc, which requires the optimizer to run the -mem2reg pass. The backend therefore always passes -mem2reg explicitly, unless the -O<n> level that GHC passes to the optimizer is greater than 0, in which case the optimizer runs -mem2reg anyway.
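
The flag selection described above fits in a few lines. This is a sketch in Python of the rule as stated in this post, not a copy of GHC's driver code; the function name opt_flags is hypothetical.

```python
def opt_flags(opt_level):
    """Flags handed to opt for a given optimization level (hypothetical helper).

    At level 0 the -mem2reg pass is requested explicitly. At higher levels
    the -O<n> flag is passed instead, and the optimizer runs mem2reg as part
    of that pipeline anyway.
    """
    if opt_level < 0:
        raise ValueError("optimization level must not be negative")
    if opt_level > 0:
        return [f"-O{opt_level}"]
    return ["-mem2reg"]


if __name__ == "__main__":
    for level in range(3):
        print(level, opt_flags(level))
```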

LLVM static compiler

The LLVM static compiler llc turns the LLVM IR produced by the LLVM optimizer into assembly for the given target.
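
For a cross compiler, "the given target" is what distinguishes this step from a native build. As a rough sketch in Python, an llc invocation could be assembled as below. The file names and the target triple are placeholders, the helper llc_command is hypothetical, and the exact set of flags GHC passes is not shown here; only -mtriple and -o are used.

```python
def llc_command(input_path, output_path, target_triple):
    """Build an llc command line that emits assembly for `target_triple`.

    Hypothetical helper: the real GHC driver passes more flags than this.
    """
    return ["llc", f"-mtriple={target_triple}", input_path, "-o", output_path]


if __name__ == "__main__":
    # Placeholder file names and triple, chosen for the example only.
    print(" ".join(llc_command("Main.opt.bc", "Main.s", "aarch64-unknown-linux-gnu")))
```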

GHC’s LLVM Mangler

After the LLVM IR that GHC produces has been fed through LLVM’s optimizer and static compiler, the resulting assembly might need some special attention. Therefore GHC passes the generated assembly through the LLVM Mangler. The mangler currently ensures that -dead_strip has no effect on Mach-O platforms (macOS, iOS, …). Dead stripping on Mach-O platforms breaks GHC’s Tables Next To Code optimization, which requires functions to carry prefix data. LLVM unconditionally inserts .subsections_via_symbols into the assembly. This leads the linker to believe that only the code following a live function symbol needs to be retained, so it strips away the prefix data if the preceding symbol is considered dead. With LLVM 5 this should no longer be needed (LLVM: D30770).
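
Since the problem comes from the .subsections_via_symbols directive, one way to neutralise it is to drop that line from the assembly before it reaches the assembler. The Python sketch below shows that idea only. It is an assumption made for illustration and is not GHC's actual mangler code, and the sample assembly is made up.

```python
def drop_subsections_via_symbols(asm):
    """Remove every line that consists only of .subsections_via_symbols."""
    kept = [
        line
        for line in asm.splitlines()
        if line.strip() != ".subsections_via_symbols"
    ]
    # Preserve the trailing newline if the input had one.
    return "\n".join(kept) + ("\n" if asm.endswith("\n") else "")


if __name__ == "__main__":
    # Made-up sample input, not real GHC output.
    sample = (
        "\t.section\t__TEXT,__text\n"
        "_foo:\n"
        "\tretq\n"
        ".subsections_via_symbols\n"
    )
    print(drop_subsections_via_symbols(sample), end="")
```

Without the directive the linker has no licence to treat the region before a symbol as a separately strippable unit, which is the effect the paragraph above describes.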

The mangler currently performs two additional rewrites: function-to-object mangling for ELF, and AVX instruction rewrites that fix AVX stack spills. For AVX, GHC essentially lies to LLVM about the stack being 32-byte aligned, and then needs to rewrite the aligned AVX instructions to their unaligned counterparts.
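
Both rewrites are line-oriented text substitutions, which the following Python sketch tries to convey. The mnemonic table and the decision to rewrite every .type line are assumptions made for this example; which instructions and symbols GHC's mangler actually touches is not covered here.

```python
import re

# Aligned AVX moves and their unaligned counterparts (assumed table).
ALIGNED_TO_UNALIGNED = {
    "vmovaps": "vmovups",
    "vmovapd": "vmovupd",
    "vmovdqa": "vmovdqu",
}

# Leading whitespace, first token (the mnemonic), remainder of the line.
_INSTRUCTION = re.compile(r"^(\s*)(\S+)(.*)$")


def rewrite_avx_line(line):
    """Swap an aligned AVX move for its unaligned variant, if present."""
    match = _INSTRUCTION.match(line)
    if match is None:
        return line
    indent, mnemonic, rest = match.groups()
    return indent + ALIGNED_TO_UNALIGNED.get(mnemonic, mnemonic) + rest


def rewrite_symbol_type(line):
    """On ELF .type directives, mark the symbol as an object, not a function."""
    if line.lstrip().startswith(".type"):
        return line.replace("@function", "@object").replace("%function", "%object")
    return line


def mangle(asm):
    """Apply both rewrites to every line of the assembly text."""
    lines = (rewrite_symbol_type(rewrite_avx_line(l)) for l in asm.splitlines())
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample input, not real GHC output.
    print(mangle("\t.type\tfoo_info,@function\n\tvmovaps\t%ymm0, 32(%rsp)"))
```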

The Assembler

Finally, the mangled assembly is turned into .o object code, which is then handed off to the linker. On macOS, clang is currently used as the assembler instead of the system assembler.

That concludes our mid-level tour through GHC’s LLVM backend. Please note that I did not discuss the optional Splitter and MergeForeign phases.