Androidiots Podcast 5: The Black Magic behind Android Runtime — Part I
How many times have you heard the terms ART, Dalvik, DVM, DEX, AAPT or Zygote process and chosen to ignore them or read about them later? Well, no more. In this episode of the AndroIdiots Podcast, we talk with Romi Chandra & Amanjeet Singh from Bobble KeyBoard Engineering Team about everything that goes on under the hood of the Android Runtime.
Android Package (APK), the file format we are so used to seeing, is the format the Android operating system uses to distribute and install mobile applications. An APK usually consists of a manifest file, assets & resources, our compiled code (classes.dex) and more. AAPT (Android Asset Packaging Tool) is the tool that allows us to view, create, and update the APK. An important part of the build process is the creation of the classes.dex file.
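Since an APK is a ZIP archive, its contents can be listed with the standard java.util.zip classes. The following is only a minimal sketch: the class name ApkLister and the command-line usage are made up for illustration, and it shows entry names only, without decoding the binary manifest or resources.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of any Android tool.
final class ApkLister {
    /** Returns the entry names stored in a ZIP-format archive such as an APK. */
    static List<String> listEntries(Path apk) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(apk.toFile())) {
            for (ZipEntry entry : Collections.list(zip.entries())) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: pass the path of an APK as the first argument.
        for (String name : listEntries(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

Run against a typical APK, the listing should include entries such as AndroidManifest.xml and classes.dex, matching the parts described above.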
Java ByteCode & JVM
The Java compiler (javac) converts Java source code into .class files. One .class file gets created for every Java class, interface, or enum in your source code, including nested ones. So if we have an outer class ABC and an inner class DEF in a single .java file, they compile to ABC.class and ABC$DEF.class respectively. These .class files contain Java bytecode that can be executed by a JVM.
So what happens on Android devices? Do all Android devices have a JVM for this code to run on? The answer is NO.
Android introduced a new virtual machine called the Dalvik Virtual Machine (DVM). DVM is made specifically for handheld devices, which have tighter constraints on processing power, battery life and memory. The Dalvik VM executes files in the Dalvik Executable (.dex) format, which is optimised for a minimal memory footprint. Rather than creating a .dex file for each and every .class file, the dx tool compiles all .class files down into a single file called classes.dex. This single .dex file contains the bytecode for all the source code and libraries in the app.
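One detail worth checking for yourself: javac names the class file of a nested class after its enclosing class, so an inner class DEF inside ABC ends up in ABC$DEF.class. A small sketch that prints the binary names at runtime (the helper method names are illustrative only):

```java
// Outer class: compiled to ABC.class.
final class ABC {
    // Inner class: compiled to ABC$DEF.class, not DEF.class.
    class DEF {}

    static String outerBinaryName() {
        return ABC.class.getName();
    }

    static String innerBinaryName() {
        return DEF.class.getName();
    }

    public static void main(String[] args) {
        // When compiled in the default package this prints "ABC" and then "ABC$DEF".
        System.out.println(outerBinaryName());
        System.out.println(innerBinaryName());
    }
}
```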
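To get a feel for the .dex format, here is a rough sketch that inspects the start of a .dex file. The layout it assumes comes from the published Dalvik Executable format rather than from the podcast: the file begins with the magic bytes "dex\n", a three-digit version and a NUL byte, and a little-endian file_size field sits at offset 32. The class and method names are hypothetical, and the sample header is synthetic rather than taken from a real app.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for peeking at a DEX header; not a full parser.
final class DexHeader {
    private static boolean isAsciiDigit(byte b) {
        return b >= '0' && b <= '9';
    }

    /** True if the data starts with "dex\n", three version digits and a NUL byte. */
    static boolean hasDexMagic(byte[] data) {
        return data.length >= 8
                && data[0] == 'd' && data[1] == 'e' && data[2] == 'x' && data[3] == '\n'
                && isAsciiDigit(data[4]) && isAsciiDigit(data[5]) && isAsciiDigit(data[6])
                && data[7] == 0;
    }

    /** Format version digits, e.g. "035". Call only after hasDexMagic returns true. */
    static String version(byte[] data) {
        return new String(data, 4, 3, StandardCharsets.US_ASCII);
    }

    /** file_size: unsigned 32-bit little-endian value at offset 32. Needs at least 36 bytes. */
    static long fileSize(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt(32) & 0xFFFFFFFFL;
    }

    /** Builds a synthetic 0x70-byte header for demonstration; all other fields are left zeroed. */
    static byte[] sampleHeader() {
        byte[] header = new byte[0x70];
        byte[] magic = {'d', 'e', 'x', '\n', '0', '3', '5', 0};
        System.arraycopy(magic, 0, header, 0, magic.length);
        ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).putInt(32, header.length);
        return header;
    }

    public static void main(String[] args) {
        byte[] header = sampleHeader();
        System.out.println("magic ok: " + hasDexMagic(header));
        System.out.println("version: " + version(header));
        System.out.println("file_size: " + fileSize(header));
    }
}
```

To try it on a real file, read the first bytes of a classes.dex extracted from an APK and pass them to the same methods.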
Dalvik has a few advantages over the JVM, such as:
1. It uses its own register-based instruction set built from 16-bit units, whereas the JVM uses a stack-based instruction set with 8-bit opcodes. This reduces the Dalvik instruction count and increases interpreter speed.
2. It leaves a smaller memory footprint, i.e. .dex files are typically smaller than the equivalent .jar files.
How is ART (& previously DVM) faster than JVM?
Android can be thought of as an application-specific implementation of Linux, where the application is a phone or a tablet. In Android, every app runs as a separate Linux user, and each app runs in its own process. So every app in Android needs a separate process and hence a separate virtual machine instance. Dalvik is highly optimized for running multiple VM instances with as much shared memory as possible. Sharing memory between instances does not mean the VMs share state too. More here.
Every bytecode instruction ultimately boils down to instructions for the CPU. Instruction sets themselves fall into different classes. Read here. This is another differentiator between DVM, which is register based, and JVM, which is stack based.
Stack-based machines must use instructions to load data onto the stack and manipulate that data, and thus require a lot more instructions than register-based machines to implement the same high-level code. But each instruction in a register-based machine needs to encode the registers that hold its operands. Hence instructions are larger in such machines. This explains why Dalvik uses 16-bit instruction units whereas Java uses 8-bit opcodes. That's not to say one is better than the other. A stack-based design makes very few assumptions about the target hardware (registers, CPU features), so it's easy to implement a VM on a wide variety of hardware.
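The trade-off can be seen in a toy example. The sketch below is not real JVM or Dalvik bytecode; it is a pair of made-up interpreters that both compute c = a + b. The stack version needs four small instructions (roughly like iload, iload, iadd, istore on the JVM), while the register version needs one wider instruction that names all three registers (roughly like Dalvik's add-int).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreters for illustration only; opcodes and encodings are invented.
final class StackVsRegister {
    static final int LOAD = 0;
    static final int ADD = 1;
    static final int STORE = 2;

    /** Runs {opcode, operand} instructions on a stack machine; returns how many were executed. */
    static int runStack(int[][] program, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int executed = 0;
        for (int[] ins : program) {
            switch (ins[0]) {
                case LOAD:
                    stack.push(locals[ins[1]]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case STORE:
                    locals[ins[1]] = stack.pop();
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode " + ins[0]);
            }
            executed++;
        }
        return executed;
    }

    /** Runs {dst, srcA, srcB} add instructions on a register machine; returns how many were executed. */
    static int runRegister(int[][] program, int[] registers) {
        int executed = 0;
        for (int[] ins : program) {
            registers[ins[0]] = registers[ins[1]] + registers[ins[2]];
            executed++;
        }
        return executed;
    }

    public static void main(String[] args) {
        int[] locals = {2, 3, 0};
        int[][] stackProgram = {{LOAD, 0}, {LOAD, 1}, {ADD, 0}, {STORE, 2}};
        int stackCount = runStack(stackProgram, locals);
        System.out.println("stack: " + stackCount + " instructions, c = " + locals[2]);

        int[] registers = {2, 3, 0};
        int[][] registerProgram = {{2, 0, 1}};
        int registerCount = runRegister(registerProgram, registers);
        System.out.println("register: " + registerCount + " instruction, c = " + registers[2]);
    }
}
```

Both runs produce c = 5; the stack program executes 4 instructions and the register program executes 1, but that single instruction has to carry three register operands.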
Here are some more links to deep dive into Dalvik Internals and its instruction sets:
- Google I/O 2008 — Dalvik Virtual Machine Internals
- Deep dive into what makes the Android apps run
- A lightning talk by Jesse Wilson from Square Engineering
ART vs Dalvik
ART was first introduced in the Android 4.4 release (KitKat) but wasn't enabled by default. It completely replaced Dalvik from Android 5.0 (Lollipop) onwards.
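On a device where either runtime may be active, the Android documentation suggests checking the java.vm.version system property: under ART its value is "2.0.0" or higher. A minimal sketch of that check follows; the class and method names are hypothetical, and the result is only meaningful on Android, since a desktop JVM reports an unrelated version string under the same property.

```java
// Hypothetical helper; only meaningful when run on an Android device.
final class RuntimeCheck {
    /** True if the major component of the version string is 2 or higher. */
    static boolean isArtVersion(String vmVersion) {
        if (vmVersion == null || vmVersion.isEmpty()) {
            return false;
        }
        int dot = vmVersion.indexOf('.');
        String major = dot < 0 ? vmVersion : vmVersion.substring(0, dot);
        try {
            return Integer.parseInt(major) >= 2;
        } catch (NumberFormatException e) {
            // Unexpected format: make no claim about the runtime.
            return false;
        }
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.vm.version");
        System.out.println("java.vm.version = " + version);
        System.out.println("looks like ART: " + isArtVersion(version));
    }
}
```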
ART has two main features compared to Dalvik:
- Ahead-Of-Time (AOT) compilation in ART vs Just-In-Time (JIT) compilation in Dalvik: AOT means our apps are compiled to native code once, at install time. That native code is stored on the phone, so what runs is effectively native code, not bytecode. JIT, on the other hand, compiles bytecode into native code on the fly, adding both latency and memory pressure.
- Improved Garbage Collection:
ART shipped with a better GC design than Dalvik. Its Concurrent Mark Sweep (CMS) collector needs only one pause per collection, compared with two in Dalvik. Dalvik's first pause, which did mostly root marking, is done concurrently in ART by getting the threads to mark their own roots and then resume running right away.
Together, these changes make apps noticeably faster at runtime on ART than on Dalvik.
ART also introduced heap compaction. Compaction means moving objects around in RAM so that the dead objects (the ones the GC is supposed to reclaim) are removed and all remaining objects become contiguous in RAM.
An excerpt from source.android.com:
The other main area where the ART GC is different than Dalvik is the introduction of moving garbage collectors. The goal of moving GCs is to reduce memory usage of backgrounded apps through heap compaction. Currently, the event that triggers heap compaction is ActivityManager process-state changes. When an app goes to background, it notifies ART the process state is no longer jank “perceptible.” This enables ART do things that cause long application thread pauses, such as compaction and monitor deflation.
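The idea of compaction can be illustrated with a toy model. The sketch below treats the heap as an array of slots where null marks a dead object, and slides the live ones to the front. It is an illustration of the concept only, not how ART implements it: a real moving collector also has to update every reference that points at a moved object.

```java
import java.util.Arrays;

// Toy model of heap compaction; slot contents stand in for live objects.
final class CompactionSketch {
    /** Slides live (non-null) entries to the front of the array; returns the live count. */
    static int compact(String[] heap) {
        int next = 0; // next free slot at the front
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                if (i != next) {
                    heap[next] = heap[i];
                    heap[i] = null;
                }
                next++;
            }
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", null, null, "C"};
        int live = compact(heap);
        // Prints: 3 live objects: [A, B, C, null, null, null]
        System.out.println(live + " live objects: " + Arrays.toString(heap));
    }
}
```

After compaction the free space is one contiguous run at the end, which is what lets a backgrounded app's heap shrink.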
dexopt & dex2oat
Optimisations do not stop at .dex file creation. In the Dalvik VM, a tool called dexopt runs on the .dex file and outputs an .odex (Optimized DEX) file. This is very similar to the original .dex file, except that it uses some optimized instructions.
In ART, dex2oat takes a .dex file and compiles it to native code. The result is essentially an ELF file that is then executed natively. So instead of bytecode that is interpreted by a VM, the app now has native code that the processor can execute directly.
With this, we come to the end of Part 1 of this two-part series. We want to keep the AndroIdiots Podcast short and crisp, but this time it got pretty exciting as we were uncovering the black magic behind the Android Runtime. So we decided to split it into two parts.
We really hope you enjoyed this episode and rest assured the next part is even more intriguing. We will learn about Multidex and the issues associated with it.
Please do provide feedback, questions and suggestions in the comments section.