elfconv: an experimental AOT compiler that translates Linux/AArch64 ELF binaries to WebAssembly.

Masashi Yoshimura
Published in nttlabs
Jan 10, 2024

Language support for WebAssembly

WebAssembly (WASM) is a binary format with a virtual instruction set. It was created to reduce the performance overhead of JavaScript in browsers, and it is also expected to be used for sandboxed environments and portable distribution of applications. Many programming languages now support WASM (e.g. C, C++, Rust, Go), so we can easily build WASM applications.

However, it is difficult to build a WASM application in the following cases.

  1. The programming language doesn’t support WASM.
  2. The source code is not available (e.g. the license does not allow it to be made public).
  3. Building the source code takes time and effort, for example because the libraries it depends on are old versions.

What is elfconv?

To address these cases, we have implemented “elfconv”, an experimental AOT compiler that translates Linux/AArch64 ELF binaries to WebAssembly. With this tool, we can build a WASM application from nothing but a Linux/AArch64 ELF binary.

https://github.com/yomaytk/elfconv

(Note: elfconv is a work in progress, so its capabilities are limited. Your ELF binary may fail to compile, or the generated WASM application may fail to run even if compilation succeeds.)

How does it work?

elfconv first converts the Linux ELF binary (currently only AArch64 is supported) to LLVM bitcode targeting WASM, and then compiles that bitcode to a WASM binary using an existing compiler such as emscripten. To generate the LLVM bitcode, elfconv uses remill, a library for lifting CPU instructions to LLVM IR. remill supports multiple CPU architectures: AArch64, SPARC32, SPARC64, x86, and amd64 (please see its README.md for more details). Furthermore, a Linux ELF binary may issue Linux system calls, so elfconv currently statically links a file that emulates them. Fig.1 shows the overall architecture of elfconv.

Fig.1 Architecture of elfconv project
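
To give a feel for what “lifting” means, here is a toy sketch in Python. It is not remill's API and it does not produce LLVM IR. It only decodes one AArch64 instruction form (the 64-bit `ADD Xd, Xn, #imm12`) and turns it into a function over an explicit register state, which is the basic idea of expressing a machine instruction as an operation on a state. The names `State` and `lift_add_imm` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MASK64 = (1 << 64) - 1


@dataclass
class State:
    """Hypothetical CPU state: general-purpose registers x0..x30 and pc."""
    x: List[int] = field(default_factory=lambda: [0] * 31)
    pc: int = 0


def lift_add_imm(insn: int) -> Callable[[State], None]:
    """Toy lifter for the 64-bit `ADD Xd, Xn, #imm12{, LSL #12}` encoding."""
    # Bits 31..23 must be 1_0_0_100010 (sf=1, op=0, S=0, ADD immediate).
    if (insn >> 23) & 0x1FF != 0b100100010:
        raise NotImplementedError(f"unsupported instruction: {insn:#010x}")
    shift = 12 if (insn >> 22) & 1 else 0
    imm = ((insn >> 10) & 0xFFF) << shift
    rn = (insn >> 5) & 0x1F
    rd = insn & 0x1F
    if rn == 31 or rd == 31:
        # Register 31 means SP in this encoding; the toy state has no SP.
        raise NotImplementedError("SP operands are not modeled in this sketch")

    def semantics(state: State) -> None:
        # 64-bit wrap-around addition, then advance to the next instruction.
        state.x[rd] = (state.x[rn] + imm) & MASK64
        state.pc += 4  # every AArch64 instruction is 4 bytes long

    return semantics


if __name__ == "__main__":
    state = State()
    state.x[1] = 41
    lift_add_imm(0x91000420)(state)  # encoding of `add x0, x1, #1`
    print(state.x[0], state.pc)
```

Any encoding the sketch does not recognize raises `NotImplementedError`, which mirrors in miniature why a translator that covers only part of an instruction set can fail on some binaries.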

We prepared a simple demonstration that shows how elfconv compiles a Linux/AArch64 ELF binary to WASM and how emscripten executes it. The ELF binary used in the demo is a simple program that calculates 100 prime numbers in ascending order. It doesn’t use libc, because elfconv takes more than 1–2 minutes to compile a binary that includes libc. In this demo, we use elfconv/dev.sh to generate the WASM binary.

Demo

You can easily try elfconv using a Docker container (please see README.md for more details).

Current limitations and future work

elfconv is a work in progress, and its features are limited as follows.

  1. Only non-stripped ELF binaries are supported (elfconv currently uses the symbol table to identify every function).
  2. Many Linux system calls are not implemented.
  3. Only a subset of CPU instructions can be converted.
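
The first limitation follows from how function boundaries are found. As a rough illustration (not elfconv's code), the Python sketch below parses raw little-endian ELF64 symbol-table entries and keeps only the function symbols. In a stripped binary the `.symtab` section has been removed, so this source of function addresses is gone. The helper name `function_symbols` is hypothetical, and the sketch assumes the caller has already extracted the bytes of the `.symtab` and `.strtab` sections.

```python
import struct
from typing import List, Tuple

# Elf64_Sym layout: st_name (u32), st_info (u8), st_other (u8),
# st_shndx (u16), st_value (u64), st_size (u64) -> 24 bytes in total.
ELF64_SYM = struct.Struct("<IBBHQQ")
STT_FUNC = 2  # symbol type for functions (low 4 bits of st_info)


def function_symbols(symtab: bytes, strtab: bytes) -> List[Tuple[str, int, int]]:
    """Return (name, address, size) for each function symbol, sorted by address."""
    funcs = []
    for off in range(0, len(symtab) - ELF64_SYM.size + 1, ELF64_SYM.size):
        st_name, st_info, _other, _shndx, st_value, st_size = ELF64_SYM.unpack_from(symtab, off)
        if st_info & 0xF != STT_FUNC:
            continue  # skip objects, sections, files, and untyped symbols
        end = strtab.index(b"\x00", st_name)  # names are NUL-terminated
        funcs.append((strtab[st_name:end].decode(), st_value, st_size))
    return sorted(funcs, key=lambda f: f[1])
```

Given such a list, a translator knows where each function starts and how long it is, which is the information that is lost when the symbol table is stripped.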

Furthermore, targeting WASM brings the following challenges.

  1. Supporting dynamic linking.
  2. Some Linux system calls are difficult to implement (e.g. fork, exec).
  3. Compiling a large ELF binary takes a lot of time.
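
To make the system-call items more concrete, the following Python model sketches what an emulation layer does. It is only an illustration under stated assumptions, not elfconv's runtime. It follows the AArch64 Linux convention (system call number in x8, arguments in x0..x5, result in x0), handles `write` (64) and `exit` (93), and returns `-ENOSYS` for everything else. The `Machine` class and `emulate_syscall` function are hypothetical names.

```python
import io
from dataclasses import dataclass, field
from typing import List, Optional

# Linux errno values.
EBADF = 9     # bad file descriptor
EFAULT = 14   # bad address
ENOSYS = 38   # function not implemented

# AArch64 Linux system call numbers.
SYS_WRITE = 64
SYS_EXIT = 93


@dataclass
class Machine:
    """Hypothetical guest state: registers, flat memory, captured output."""
    x: List[int] = field(default_factory=lambda: [0] * 31)
    memory: bytearray = field(default_factory=lambda: bytearray(4096))
    stdout: io.BytesIO = field(default_factory=io.BytesIO)
    exit_code: Optional[int] = None


def emulate_syscall(m: Machine) -> None:
    """Handle one `svc #0`: number in x8, arguments in x0..x5, result in x0."""
    nr = m.x[8]
    if nr == SYS_WRITE:
        fd, buf, count = m.x[0], m.x[1], m.x[2]
        data = m.memory[buf:buf + count]
        if fd not in (1, 2):
            result = -EBADF
        elif len(data) != count:
            result = -EFAULT  # buffer runs past the end of guest memory
        else:
            m.stdout.write(data)  # stdout and stderr share one buffer here
            result = count
    elif nr == SYS_EXIT:
        m.exit_code = m.x[0] & 0xFF
        return
    else:
        result = -ENOSYS  # anything this sketch does not implement
    m.x[0] = result & ((1 << 64) - 1)  # store as an unsigned 64-bit value
```

A call like `write` maps naturally onto a host-side buffer or stream, whereas process-creating calls have no such simple counterpart in this model, which hints at why they are listed among the harder ones.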

We will continue to develop elfconv and address the above issues in discussion with the WASM community.

NTT is hiring!

We at NTT are looking for engineers who work in open source communities such as WebAssembly and containers. Please visit https://www.rd.ntt/e/sic/recruit/ to see how to join us.
