Peephole wisely with Android R8
There are plenty of blogs and documentation about D8 and R8, so most of us are already familiar with them. We also know that R8 has been available in Android Studio since 3.3 beta.
As we all know, Java or Kotlin code is first compiled to .class files, which then go through desugaring, optimization and dexing. With D8 and R8 this happens in a single combined step, where it previously took multiple steps. We can also say that R8 replaces ProGuard in many ways.
I recently read this post on the GuardSquare blog, written by their CTO Eric Lafortune: https://www.guardsquare.com/en/blog/proguard-and-r8. I found it an interesting and fair comparison between ProGuard and R8, though it is a little biased towards ProGuard 🙊 (for obvious reasons).
I wanted to dig a little deeper and see how some of these optimizations actually look in a dex file, and share what I found. We won't go through the full list of optimizations that R8 offers, only a few that I found really interesting.
Let's get the D8 and R8 sources from https://r8.googlesource.com/r8, where you can also find the instructions to build them.
$ git clone https://r8.googlesource.com/r8
$ cd r8
$ tools/gradle.py d8 r8
Constant Folding
Consider the following code,
class Test2 {
    fun main(args: Array<String>) {
        val output = 3 * 6 / 2 + 9
        println("lets just analyze" + output)
    }
}
This evaluates the arithmetic expression and prints the result. Let's run the compiled class through R8 and see what we get.
To run use the following command,
java -jar r8.jar --lib $ANDROID_HOME/platforms/android-28/android.jar --release --output . --pg-conf rules.txt Test2.class

rules.txt

-keepclasseswithmembers class * {
    public static void main(java.lang.String[]);
}
-dontobfuscate

The rules are very similar to ProGuard's. -dontobfuscate is there just so that we can see the real method names when we peek into the dex file.
000218: |[000218] Test2.main:([Ljava/lang/String;)V
000228: 1a00 1900 |0000: const-string v0, "args" // string@0019
00022c: 7120 0800 0200 |0002: invoke-static {v2, v0}, Lkotlin/jvm/internal/Intrinsics;.checkParameterIsNotNull:(Ljava/lang/Object;Ljava/lang/String;)V // method@0008
000232: 2202 0600 |0005: new-instance v2, Ljava/lang/StringBuilder; // type@0006
000236: 7010 0400 0200 |0007: invoke-direct {v2}, Ljava/lang/StringBuilder;.<init>:()V // method@0004
00023c: 1a00 1f00 |000a: const-string v0, "lets just analyze" // string@001f
000240: 6e20 0600 0200 |000c: invoke-virtual {v2, v0}, Ljava/lang/StringBuilder;.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; // method@0006
000246: 1300 1200 |000f: const/16 v0, #int 18 // #12
00024a: 6e20 0500 0200 |0011: invoke-virtual {v2, v0}, Ljava/lang/StringBuilder;.append:(I)Ljava/lang/StringBuilder; // method@0005
000250: 6e10 0700 0200 |0014: invoke-virtual {v2}, Ljava/lang/StringBuilder;.toString:()Ljava/lang/String; // method@0007
000256: 0c02 |0017: move-result-object v2
000258: 6200 0000 |0018: sget-object v0, Ljava/lang/System;.out:Ljava/io/PrintStream; // field@0000
00025c: 6e20 0200 2000 |001a: invoke-virtual {v0, v2}, Ljava/io/PrintStream;.println:(Ljava/lang/Object;)V // method@0002
000262: 0e00 |001d: return-void
catches : (none)
positions :
0x0005 line=4
locals :
0x0000 - 0x001e reg=1 this LTest2;
source_file_idx : 19 (Test2.kt)
Note: If you have read dex files before, you will know that they carry a lot of other information, such as the number of direct methods, virtual methods, etc. For simplicity, those parts are trimmed out so that we can focus on the output that matters for this scenario.
If you take a look at the output above, you can clearly see that R8 has already evaluated the expression "3*6/2+9" for us: the result, 18, is loaded as a constant at bytecode index 000f, without any multiply, divide or add instruction being executed. This might not be much of a benefit for such a short expression, but for longer constant expressions it saves real work at run time.
This is called “Constant Folding” in compiler design terminology.
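To make the idea concrete, here is a minimal sketch of constant folding over a toy expression tree. This is not how R8 is implemented; the types Expr, Const, Var and BinOp and the function fold are hypothetical names made up for this illustration.

```kotlin
// Toy expression tree. These types are illustrative assumptions, not part of R8.
sealed class Expr
data class Const(val value: Int) : Expr()
data class Var(val name: String) : Expr()
data class BinOp(val op: Char, val left: Expr, val right: Expr) : Expr()

// Recursively replaces every operation whose operands are both constants
// with the constant result. Anything involving a variable is left alone.
fun fold(expr: Expr): Expr = when (expr) {
    is Const, is Var -> expr
    is BinOp -> {
        val l = fold(expr.left)
        val r = fold(expr.right)
        // A division by a constant zero is not folded, so it still fails at run time.
        if (l is Const && r is Const && !(expr.op == '/' && r.value == 0)) {
            Const(
                when (expr.op) {
                    '+' -> l.value + r.value
                    '-' -> l.value - r.value
                    '*' -> l.value * r.value
                    '/' -> l.value / r.value
                    else -> error("unknown operator ${expr.op}")
                }
            )
        } else {
            BinOp(expr.op, l, r)
        }
    }
}

fun main() {
    // 3 * 6 / 2 + 9 with the usual precedence: ((3 * 6) / 2) + 9
    val expr = BinOp('+', BinOp('/', BinOp('*', Const(3), Const(6)), Const(2)), Const(9))
    println(fold(expr)) // Const(value=18)
}
```

The whole tree collapses to a single constant, which is the same effect we saw as const/16 v0, #int 18 in the dex output.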
Redundant load and store elimination
Now let us modify the code we wrote above a bit,
class Test2 {
    fun main(args: Array<String>) {
        compute(0, 0, 0, 0, 0)
    }

    fun compute(x: Int, y: Int, i: Int, z: Int, w: Int) {
        var y = y
        var i = i
        var z = z
        var w = w
        y = x + 5
        i = y
        z = i
        w = z * 3
        println(w)
    }
}
If we look at the code above carefully, all we did was add a separate function that accepts integers and performs a simple calculation with redundant assignments. The body of compute() is poorly written: it loads and stores variables for no reason. We would expect this to cause some noise in the dex file in the form of unnecessary, redundant operations. But let's see what actually happens when we run this through R8,
#0 : (in LTest2;)
name : 'compute'
type : '(IIIII)V'
access : 0x0011 (PUBLIC FINAL)
code -
registers : 6
ins : 6
outs : 2
insns size : 10 16-bit code units
0001e8: |[0001e8] Test2.compute:(IIIII)V
0001f8: d801 0105 |0000: add-int/lit8 v1, v1, #int 5 // #05
0001fc: da01 0103 |0002: mul-int/lit8 v1, v1, #int 3 // #03
000200: 6202 0000 |0004: sget-object v2, Ljava/lang/System;.out:Ljava/io/PrintStream; // field@0000
000204: 6e20 0300 1200 |0006: invoke-virtual {v2, v1}, Ljava/io/PrintStream;.println:(I)V // method@0003
00020a: 0e00 |0009: return-void
catches : (none)
positions :
0x0004 line=16
locals :
0x0000 - 0x000a reg=0 this LTest2;
#1 : (in LTest2;)
name : 'main'
type : '([Ljava/lang/String;)V'
access : 0x0011 (PUBLIC FINAL)
code -
registers : 9
ins : 2
outs : 6
insns size : 15 16-bit code units
If we analyze the dex output above we can, of course, find two virtual methods, but since we are more interested in compute() let's look at that block. The instructions at bytecode indexes 0000 to 0006 are the ones to focus on, and they show that R8 has cut out a lot of the unwanted assignments we had in our code and turned it into something more efficient. Only the two math operations (add, multiply) survive, as the rest of the lines were just repeated loads and stores of variables.
Bytecode index 0000 adds 5 to x and keeps the result (our y) in register v1. Bytecode index 0002 multiplies that same register v1 by 3, so v1 now holds what we called w. No copy is needed for i, z or w. At bytecode index 0004 System.out is loaded into v2, and finally, at index 0006, println is called with v1.
So the code we wrote has practically turned into something like this,
y = x + 5
// 0001f8: d801 0105 |0000: add-int/lit8 v1, v1, #int 5 // #05
w = y * 3
// 0001fc: da01 0103 |0002: mul-int/lit8 v1, v1, #int 3 // #03
// 000200: 6202 0000 |0004: sget-object v2 (System.out is loaded into v2; w stays in v1)
println(w)
// 000204: 6e20 0300 1200 |0006: invoke-virtual {v2, v1}, Ljava/io/PrintStream;.println:(I)V // method@0003
This is called "Redundant load and store elimination" in compiler design terminology.
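As a rough illustration of how such copies can disappear, here is a small sketch of forward copy propagation over a toy instruction list. It assumes straight-line code in which every variable is assigned exactly once, and the types Insn, Copy, Arith and Print and the function propagateCopies are hypothetical names made up for this example, not R8's internal representation.

```kotlin
// Toy three-address instructions. Illustrative assumptions, not R8's IR.
sealed class Insn
data class Copy(val dst: String, val src: String) : Insn()                              // dst = src
data class Arith(val dst: String, val op: Char, val src: String, val lit: Int) : Insn() // dst = src op lit
data class Print(val src: String) : Insn()                                              // println(src)

// Assumes straight-line code where each variable is assigned exactly once.
// Every plain copy is dropped, and later uses read the original variable instead.
fun propagateCopies(code: List<Insn>): List<Insn> {
    val origin = mutableMapOf<String, String>()
    fun resolve(name: String): String = origin[name] ?: name
    val out = mutableListOf<Insn>()
    for (insn in code) {
        when (insn) {
            is Copy -> { origin[insn.dst] = resolve(insn.src) }
            is Arith -> { out.add(Arith(insn.dst, insn.op, resolve(insn.src), insn.lit)) }
            is Print -> { out.add(Print(resolve(insn.src))) }
        }
    }
    return out
}

fun main() {
    // The body of compute(): y = x + 5; i = y; z = i; w = z * 3; println(w)
    val body = listOf(
        Arith("y", '+', "x", 5),
        Copy("i", "y"),
        Copy("z", "i"),
        Arith("w", '*', "z", 3),
        Print("w")
    )
    propagateCopies(body).forEach(::println)
}
```

Running it on the body of compute() leaves only the addition, the multiplication and the print, which matches the shape of the dex output. R8 goes one step further and also reuses register v1 for both results.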
Summary
In both of our examples above, we saw multiple things,
- Constant folding
- Redundant load and store elimination
- Deletion of useless operations
All of these are techniques of Peephole Optimization. It works on the principle of replacement: small parts of the program are replaced by shorter and faster code without any change in the result. This not only reduces the number of instructions but can also help performance and reduce the memory footprint to an extent.
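The "replacement" idea can be sketched as a pass that looks only at a tiny window of instructions at a time. The following is a toy example under stated assumptions: the instruction types Op, AddLit and Move and the function peephole are hypothetical and are not dex opcodes or R8 code.

```kotlin
// Toy register instructions. Illustrative assumptions, not real dex opcodes.
sealed class Op
data class AddLit(val reg: String, val value: Int) : Op() // reg = reg + value
data class Move(val dst: String, val src: String) : Op()  // dst = src

// Looks only at the current instruction and the last emitted one (the "peephole")
// and replaces known wasteful patterns with shorter equivalents.
fun peephole(code: List<Op>): List<Op> {
    val out = mutableListOf<Op>()
    for (op in code) {
        val prev = out.lastOrNull()
        if (op is Move && op.dst == op.src) {
            continue // a self-move does nothing
        }
        if (op is AddLit && prev is AddLit && prev.reg == op.reg) {
            // two literal additions on the same register collapse into one
            out.removeAt(out.lastIndex)
            val sum = prev.value + op.value
            if (sum != 0) out.add(AddLit(op.reg, sum))
            continue
        }
        if (op is AddLit && op.value == 0) {
            continue // adding zero does nothing
        }
        out.add(op)
    }
    return out
}

fun main() {
    val code = listOf(
        AddLit("v1", 2),
        AddLit("v1", 3),
        Move("v2", "v2"),
        AddLit("v0", 0),
        Move("v3", "v1")
    )
    peephole(code).forEach(::println)
}
```

Five instructions go in and two come out (one addition of 5 and one move), and the program computes the same values as before.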
PS: I am not recommending that you write sloppy, redundant code just because R8 is there to the rescue 😛 I am only saying that it matters less than you might think, as R8 does a really good job of optimizing.