We recently enabled R8 in our Android app. What does this mean? Why do I sometimes use the term ProGuard and even add proguard.pro files to the source code? And how did we enable R8 in the app in several steps? All these questions are answered in this blog post.
What is R8
Generally speaking, R8 is a compiler that converts Java bytecode into “optimized” dex code.
When you want to run your Java (or Kotlin) code on an Android device, the build tool has to perform several steps. First, it compiles the source files into byte code. Secondly, it compiles the byte code into dex code. The dex format (Dalvik executable) can be understood and executed by Android.
By default, dex code (as well as byte code) can be easily decompiled by literally anyone in the world. You “only” need access to the dex code and a Java byte code decompiler. After that, you will see something very similar to the original source code of those files.
You, as a software developer, might not want that. Not only because someone can see what you have programmed, but also because you may want to hide details about you or your company. It could even be security-relevant for you, your company, or the software itself.
This is where R8 comes in. That little word “optimized” mentioned above means that R8 does a little more than the steps I just described. It not only converts the byte code into dex code. It also optimizes, shrinks, and obfuscates the byte code while converting it.
More details, please!
The shrinking technique of the R8 compiler means that it can detect which classes (methods, variables, etc.) can be removed because they are not used by the program anyway. This is quite common when you add third-party dependencies to your project. Just as an example, you might have added Guava to your project. Guava is a pretty big library in terms of functionality and lines of code. But maybe you only use a handful of utility methods from it. Without R8, you will have all the code of the library in your app. No code will be stripped out. With R8 enabled, the compiler detects the unused classes (methods, variables, etc.) and leaves them out of the dex code.
Obviously, this results in smaller apps because they contain only the code you actually need and use.
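To make the idea of "detecting unused code" a bit more tangible, here is a toy sketch in Java. It is not how R8 is implemented. It only assumes that shrinking boils down to a reachability walk over a call graph, starting from an entry point. The class `ShrinkSketch`, the method names, and the call edges in `sampleGraph` are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ShrinkSketch {

    // Returns every method reachable from the entry point in a toy call graph.
    static Set<String> reachable(Map<String, List<String>> calls, String entryPoint) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(entryPoint);
        while (!work.isEmpty()) {
            String method = work.pop();
            // Only follow the callees the first time we see a method.
            if (seen.add(method)) {
                for (String callee : calls.getOrDefault(method, List.of())) {
                    work.push(callee);
                }
            }
        }
        return seen;
    }

    // Everything that is declared but never reached is a candidate for removal.
    static Set<String> removable(Map<String, List<String>> calls, String entryPoint) {
        Set<String> unused = new LinkedHashSet<>(calls.keySet());
        unused.removeAll(reachable(calls, entryPoint));
        return unused;
    }

    // Hypothetical graph: the app calls one utility, the other two are never used.
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
                "MainActivity.onCreate", List.of("Strings.isNullOrEmpty"),
                "Strings.isNullOrEmpty", List.of(),
                "Lists.partition", List.of("Ints.checkedCast"),
                "Ints.checkedCast", List.of());
    }
}
```

With this sample graph, only the two methods that are never reached from `MainActivity.onCreate` end up in the removable set. A real shrinker has to deal with much more (fields, classes, reflection, keep rules), which is exactly why the configuration file exists.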
Furthermore, it also optimizes the code. As a simple example, it might find a switch (or a Kotlin when) statement that would be faster as an if-elseif-else statement. Then it will convert that when in byte code into an if-elseif-else statement in dex code.
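Just to illustrate what "the same logic in a different shape" means, here is a small Java sketch with both variants side by side. It does not show actual R8 output, and whether R8 rewrites a particular switch depends on the code. The class `SwitchSketch` and the status labels are invented for this example.

```java
final class SwitchSketch {

    // What the developer writes: a small switch over an int code.
    static String labelWithSwitch(int code) {
        switch (code) {
            case 0:
                return "pending";
            case 1:
                return "active";
            default:
                return "unknown";
        }
    }

    // The same logic as an if / else-if / else chain.
    static String labelWithIfElse(int code) {
        if (code == 0) {
            return "pending";
        } else if (code == 1) {
            return "active";
        } else {
            return "unknown";
        }
    }
}
```

Both methods return the same label for every input, so an optimizer is free to pick whichever form it considers cheaper without changing the behavior of the app.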
Last but not least, it obfuscates the dex code. This means it will convert the original class names you used (as well as package names, methods, variables, anything) into gibberish. For instance, in a call that looks like employee.setName(name), the employee variable might be renamed to ab and the method to a, with the name variable renamed in the same way.
This makes it nearly impossible, or at least much more difficult, to understand the source files if an attacker were able to decompile your app.
Nice, what is ProGuard?
A bit of history. As far as I know, code shrinking and code obfuscation have always been possible on Android. But in the early days of Android, the R8 compiler didn’t exist. (Much like Kotlin. It didn’t exist back then either 😀).
Before Google decided to build its own shrinking and obfuscating tool and named it R8, Android used ProGuard. ProGuard is, like the R8 compiler, a shrinking and obfuscating tool.
Why did Google build its own compiler when one already existed, you might ask? As written above, the format that Android understands is called dex code. But ProGuard doesn’t produce dex code. It produces optimized byte code. Remember what I wrote above about the two steps that happen to produce dex code. Now imagine we add another tool (ProGuard) to this chain. It would look like the following:
- source files to byte code
- byte code to optimized (shrunk and obfuscated) byte code
- optimized byte code to dex code
And this is exactly what was happening before they introduced R8. R8, on the other hand, compiles byte code directly to optimized dex code. With that, we still have only two steps to produce dex code.
The configuration file
In the best case, you can simply execute the R8 compiler and your code is shrunk, optimized, and obfuscated. But most of the time you have to add some configuration to prevent obfuscating and/or shrinking for some parts of your code. Why is that? A typical example is reflection. You (or a library you use) might use reflection. When using reflection, you don’t reference the classes directly in code. R8 then finds no direct reference to such a class, thinks it is not used, and removes (or obfuscates) it. In that case, reflection will fail at runtime, of course, because the class you want to access via reflection is gone or has a different name now.
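Here is a minimal Java sketch of that reflection problem. It runs on a plain JVM without R8, so it only simulates the effect: looking a class up by its original name works, while looking it up by a name that no longer exists fails. The class `ReflectionSketch`, its nested `Employee` class, and the name "a.b" are hypothetical and only there for illustration.

```java
import java.lang.reflect.Method;

final class ReflectionSketch {

    // Hypothetical class that is only ever reached through reflection.
    public static final class Employee {
        private String name = "";

        public void setName(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    // Looks the class up by its string name and calls setName reflectively.
    // A shrinker sees no direct reference to Employee in this method.
    static String setNameByReflection(String className, String value) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method setter = type.getMethod("setName", String.class);
            setter.invoke(instance, value);
            Method getter = type.getMethod("getName");
            return (String) getter.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflection failed for " + className, e);
        }
    }

    // Returns true if a class with this exact name can be loaded at runtime.
    static boolean canResolve(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

As long as the string passed to `Class.forName` matches the real class name, everything is fine. Once the class has been renamed or removed, the string still contains the old name and the lookup throws a `ClassNotFoundException`. The same applies to the method name "setName".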
With a configuration file, you can tell the compiler that it should not remove and/or obfuscate specific classes (methods, variables, etc.). You can also do a lot more with that configuration file. But this is probably its main use case 😬. All of the possible options can be found here.
The configuration file needs a special format so that it can be understood by the compiler. ProGuard, the old shrinker which is still a thing in the Java world, already has such a format. Because ProGuard was used before R8 in the Android world, Google decided to implement R8 so that it understands the same format as its ProGuard sibling. This means ProGuard and R8 can share the same configuration file. There is no need to change or adjust anything if you switch from one to the other. I guess this was also one of the main goals for Google in developing R8: they wanted developers to switch seamlessly from ProGuard to R8 without adjusting anything. Everything should happen behind the scenes. No new configuration syntax or format was introduced, and no new Android Gradle Plugin methods were added. They continued to use the same wording as they did years ago. For example, there is still the proguardFiles function, which takes the paths to the ProGuard (or R8? 😉) configuration files as an argument.
Basically, this is the reason why you might still hear people talking about ProGuard (in Android development) even if this is a thing of the past.
And because of all this, I also still use proguard.pro files that contain the configuration for… R8. To be honest, it would also look a bit strange to have the following code in a
How we enabled R8 in our app
I’ve been doing Android development for quite some time now. In the past, I tried enabling ProGuard for an existing app multiple times. As far as I remember, it always failed. I have no idea why that was the case. Maybe ProGuard is worse than R8? Maybe my source code was worse than it is nowadays? Maybe the documentation wasn’t as good as today? Maybe the process of how I added it to the app was not that good? Maybe I didn’t have much experience with software development?
I have no idea. I’m just guessing. All I know is that it was always a pain for me to enable it. It meant hours of testing and changing configurations, only to end up with a crashing app anyway.
To sum it up, I haven’t had a good experience (and relationship) with ProGuard.
Today, the world looks a bit different. The R8 documentation is straightforward, I learned from my previous mistakes, and I am working on a well-architected app.
Before I explain to you how we enabled it in detail, I want to give you a rough overview of the current architecture of our app first. I have the feeling this played an important role in our process of enabling R8.
From a bird’s-eye view, we have multiple independent Gradle modules that mostly contain only a “few files”. Each module has a specific purpose. We have so-called “libs” which are mainly used to provide smaller utilities or domain-specific logic. Those libs can be used across the app. A typical example of this is our lib-logging lib. This library can be added to literally any other module to provide logging functionality.
Next to our “libs” we have so-called “features”. Features, on the other hand, provide full-fledged features to the app. An example of such a feature is our feature-ride-creation module. This feature provides the whole functionality to let users create a ride. This includes not only domain-related logic but also UI-related logic.
And yes, before you ask, it is always hard to distinguish between “what is a lib and what is a feature” 😁.
Besides our well-modularized modules, we still have two “legacy” buckets: the libraries module and the lib-core module. Those modules still contain a lot of functionality, which needs to be extracted step by step into its own modules.
Anyway, back to R8 😅
We made use of our multi-module project when we started enabling R8. Instead of having one big task, “Enable R8 in the app”, we decided to split it up into multiple (smaller) tasks.
First, we enabled R8 by default but added a configuration file to each module that keeps all the classes (methods, variables, etc.) of that specific module. In this context, keep means neither removing nor obfuscating code. Afterwards, we removed the configuration file for a handful of modules to test if our general idea works. It turned out that it worked. Pretty well even!
Now, as a second task, we can go over each module, remove the configuration file, and test every single library or feature in isolation to check that R8 doesn’t break anything. If it works, nice, commit the changes, and check the next module. If it doesn’t work, keep the configuration file for now, create a new ticket to tackle this specific module later, and move on.
We can literally iterate step by step from our base implementation to improve R8 (wow, it seems we are doing Scrum right 😱).
Based on the second task, we might have one or more follow-up tickets for modules that require a deeper look before R8 can be enabled for them. But this is fine. Having most of the code already shrunk and obfuscated helps to have a better product (app).
While writing this, we are between task 1 and task 2. R8 is enabled and a handful of modules are already obfuscated, but other modules still need to be tackled (hopefully soon).
Another nice side effect of this approach is that we can slowly test R8 in production. What I mean by that is that we can see what “changes” we have by enabling it. For example, we might have a slower release pipeline on our continuous integration server. We might be doing something wrong with the mapping file or something else we don’t see yet. But we will find out 😉.