Configuration caching deep dive
What is configuration caching?
Configuration caching is a fundamental building block in making builds faster, both from the IDE and the command line. It is a highly experimental feature in Gradle 6.6 that allows the build system to record information about the task graph once and reuse it in subsequent builds, avoiding the need to configure the whole build again. It also continues earlier configuration phase improvements, in which lazy configuration was introduced to avoid unnecessary work during this stage of the build. Needless to say, this is especially important for fast iterative development, a use case the Android Studio team has been focused on.
The main goal of this effort is improving build speed. In benchmarks with the Santa Tracker Android project, we measured 35% total build time reduction in Android Studio (from 688ms to 443ms) for builds that have configuration caching enabled (measured on Linux with Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz). This is a graph showing mean total build time in milliseconds for 100 builds with and without configuration caching.
For some projects, the configuration phase may take tens of seconds, so the savings can be much more significant. This overhead is the same whether you are running a clean build, an incremental build, or an up-to-date one. To measure how long the configuration phase takes for your build, it is enough to run tasks in dry-run mode, e.g.
./gradlew :app:assembleDebug --dry-run
In addition to skipping the configuration phase, configuration caching allows tasks from the same project to run in parallel. Previously, only tasks leveraging the Worker API could run concurrently, but because configuration caching ensures that tasks are isolated and have no access to shared global state (such as the Project instance), this behavior can be enabled by default. Also, dependency resolution results are now cached between runs, which contributes to the overall build time improvements.
How to try it out?
The configuration cache is currently experimental, and we’d love for you to try it out and provide us with feedback. To use it in your build, all applied plugins in all projects should be compatible. This is necessary in order to (de)serialize the task graph safely. You will probably need to update some of your Gradle plugins, so please see this issue for a comprehensive list of plugins that are supported. If a plugin you are using is not on the list, please file an issue on its issue tracker and link to it from the Gradle issue.
The 4.1 version of the Android Gradle plugin (currently 4.1.0-beta05) is compliant, but if you’d like to pick up all bug fixes, please try out the latest 4.2 version (currently 4.2.0-alpha06). The Gradle version should be 6.6, and if you are using Kotlin, please update the Kotlin Gradle plugin to the latest 1.4 version (relevant Kotlin issue). Finally, update gradle.properties with:
org.gradle.unsafe.configuration-cache=true
# Use this flag sparingly, in case some of the plugins are not fully compatible
org.gradle.unsafe.configuration-cache-problems=warn
To verify that configuration caching is enabled, look for “Calculating task graph as no configuration cache is available for tasks…” in the Build output window in Android Studio or on the command line during the first run. The second run should reuse the configuration cache, and its output should contain “Reusing configuration cache.”
How does it work?
To dive into the details of configuration caching, we need to start with the configuration phase of the build. Even with configuration caching enabled, the first build will go through this stage. During this part of the build, all projects that have already been included (while evaluating settings.gradle) are configured by evaluating their build files. Typically, all plugins are applied first and DSL objects are instantiated. Build file evaluation then continues, and the DSL objects are assigned the values you have specified. Once build file evaluation is complete, the Android Gradle plugin (and many others that follow the same pattern) gets its Project.afterEvaluate callback invoked. It is during this callback that the Android Gradle plugin does most of its work, including creating the variants and registering the tasks.
After DSL evaluation and task registration, the next stage builds a task graph. The tasks that you’ve requested to be executed are fully configured, and so are all the tasks they depend on. This continues until the leaf tasks without dependencies are reached. The output of this phase of configuration is a task graph that the scheduling mechanism in Gradle uses to run build operations. Once the task graph is complete, configuration caching stores it on disk (for Gradle 6.6, in the .gradle/configuration-cache directory under the root project). It is able to serialize all Gradle-managed types (e.g. Provider) and all user-defined serializable types. At the end of this phase, every task has its state fully recorded and persisted.
During the second build, assuming Gradle is able to reuse the recorded cache, the task graph for the requested tasks is loaded, skipping DSL evaluation, task configuration, and so on. This means that all tasks are instantiated and their properties are loaded from the cache. From this point onwards, the build is almost the same as a non-cached one, with the added benefit of running tasks in parallel by default and reusing the dependency resolution results from the cache.
To guarantee correctness, Gradle keeps track of all inputs that impact the cached task graph, which include build files, the requested tasks, and Gradle and system properties accessed during configuration. Requesting a different set of tasks results in a different task graph, so a new cache entry has to be created. The cached state also needs to be invalidated if you change build files or buildSrc, or pass a different value for an environment variable or system property. To detect such changes, the build system creates a snapshot of the build files used when the task graph was cached. It also detects whether any tasks in buildSrc were not up-to-date. Finally, any value that impacts the configuration phase should be wrapped in a Gradle-managed type. This allows the build system to keep track of the variables whose values were used during the configuration phase.
Using compatible Gradle API
All Gradle plugins applied in the build need to be compatible with configuration caching. Because of that, a new set of APIs was introduced. Here, we examine some of the constraints imposed by configuration caching and the APIs.
Using Project instance in task
The most common incompatibility in Gradle plugins is the use of Task.getProject() in the task action. With configuration caching, tasks no longer have access to this shared state, so that they are fully isolated. This is necessary because the Project instance gives access to ConfigurationContainer and other objects that are not populated in cached runs and would therefore reflect invalid state. A number of replacement APIs have been introduced: ones focused on lazy object creation like ObjectFactory, ones that can be used to obtain project file system layout information like ProjectLayout, and ExecOperations in case you need to launch processes in your build. Here you can find a comprehensive list of APIs to migrate to.
Accessing Gradle/system properties or environment variables
What happens if you’d like to use system properties, Gradle properties, environment variables, or additional files to specify build logic inputs? Changes to build files are already tracked by the build system, but any additional value that impacts the task graph should be obtained through the ProviderFactory API. The example below shows how to obtain the value of the enableTask system property, which impacts configuration, and how to obtain the system property anotherFlag, which is just a task input. If the value of the former changes, the cache is invalidated; if the latter changes, the cache is reused and the task will not be up-to-date.
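A minimal sketch of the two cases, written as a build.gradle.kts fragment and assuming Gradle 6.6. The task class PrintFlagTask and the task name printFlag are hypothetical names chosen for illustration; only the enableTask and anotherFlag property names come from the text above.

```kotlin
// build.gradle.kts (sketch, assuming Gradle 6.6)
import org.gradle.api.DefaultTask
import org.gradle.api.provider.Property
import org.gradle.api.tasks.Input
import org.gradle.api.tasks.Optional
import org.gradle.api.tasks.TaskAction

// Hypothetical task that only prints the value of its input.
abstract class PrintFlagTask : DefaultTask() {
    @get:Input
    @get:Optional
    abstract val flag: Property<String>

    @TaskAction
    fun printFlag() {
        println("anotherFlag = ${flag.orNull}")
    }
}

// Build logic input: the value is read while configuring, so the provider
// is explicitly marked for use at configuration time. Changing
// -DenableTask invalidates the cached task graph.
val enableTask = providers.systemProperty("enableTask")
    .forUseAtConfigurationTime()

if (enableTask.orNull == "true") {
    tasks.register<PrintFlagTask>("printFlag") {
        // Task input only: the provider is wired into the task and not
        // resolved during configuration. Changing -DanotherFlag keeps the
        // cache entry but makes the task out of date.
        flag.set(providers.systemProperty("anotherFlag"))
    }
}
```

With this fragment, running `./gradlew printFlag -DenableTask=true -DanotherFlag=a` twice should reuse the cache on the second run, while switching to `-DanotherFlag=b` should still reuse the cache but re-run the task.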
Under the hood, Gradle keeps track of the value providers that were resolved during the configuration phase, and each of those is considered a build logic input. Also, it is not possible to resolve a provider during configuration unless Provider.forUseAtConfigurationTime() has been invoked, which makes it hard to introduce accidental configuration phase inputs. As already mentioned, Gradle invalidates the configuration cache if any of the build files change, so this, together with the ProviderFactory API, ensures that everything that impacts the task graph is captured.
Sharing work between tasks
If you’d like to share some work between tasks, e.g. to avoid connecting to a web server multiple times or parsing the same information many times, shared build services are a configuration-caching-compatible way of implementing that. Like tasks, build services have inputs, and those are serialized in the first run. Cached runs simply deserialize the parameters and instantiate the build services required by the tasks. An additional benefit of build services is that they fit nicely with the build lifecycle. If there are resources you’d like to release once the build finishes, implementing AutoCloseable in your build service is enough to accomplish that. Adding build listeners is not compatible with configuration caching, as those cannot be safely serialized to disk.
Lessons from Android Gradle Plugin migration
While working on making the Android Gradle plugin compatible with configuration caching, there are a number of things we learned which plugin and build script authors may find useful.
Firstly, do not be discouraged if you see something like this in the build output once you enable configuration caching as many issues are duplicates and can be fixed with little effort:
428 problems were found reusing the configuration cache, 4 of which seem unique.
We’ve had a number of issues that we easily resolved by migrating to new APIs. For instance:
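One such migration, sketched under assumptions: a task that used to launch a process through the Project instance now gets ExecOperations injected instead. The task name GitVersionTask and the git command are hypothetical, and the lambda-with-receiver syntax assumes the code lives in a build.gradle.kts file or in a project that applies the kotlin-dsl plugin.

```kotlin
// Sketch, assuming Gradle 6.6 and Kotlin DSL conventions
import javax.inject.Inject
import org.gradle.api.DefaultTask
import org.gradle.api.tasks.TaskAction
import org.gradle.process.ExecOperations

// Before (incompatible): the task action reaches for the Project instance.
//
//     @TaskAction
//     fun run() {
//         project.exec { commandLine("git", "--version") }
//     }

// After: the service is injected by Gradle, so the action no longer
// touches Project and the task state can be serialized.
abstract class GitVersionTask : DefaultTask() {
    @get:Inject
    abstract val execOperations: ExecOperations

    @TaskAction
    fun run() {
        execOperations.exec {
            commandLine("git", "--version")
        }
    }
}
```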
Check for a replacement API in case you are still using a project instance in your tasks. For most of them there should be an API that is compatible, and migration should be straightforward.
Another takeaway is to avoid creating non-serializable or expensive objects as soon as a task is created; instead, create them only when needed in the task action. E.g. in the example below we do not have to force the Handler type to be serializable, as we create it only when needed:
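A minimal sketch of this pattern follows. The Handler class here is a hypothetical stand-in for any helper that is not serializable, and ProcessTask and its message property are likewise made up for illustration.

```kotlin
// Sketch, assuming Gradle 6.6
import org.gradle.api.DefaultTask
import org.gradle.api.provider.Property
import org.gradle.api.tasks.Input
import org.gradle.api.tasks.TaskAction

// Hypothetical helper that does not implement Serializable.
class Handler {
    fun process(text: String): String = text.reversed()
}

abstract class ProcessTask : DefaultTask() {
    @get:Input
    abstract val message: Property<String>

    // Avoid: a field such as `private val handler = Handler()` would be
    // created with the task and become part of the state that the
    // configuration cache has to serialize.

    @TaskAction
    fun run() {
        // Created only at execution time, so it is never serialized.
        val handler = Handler()
        println(handler.process(message.get()))
    }
}
```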
When authoring tasks, make sure that the task inputs correctly reflect everything the task needs during execution. Avoid accessing ambient objects or anything else reachable from the Project instance. E.g. if your plugin creates a configuration, pass it to the task as a FileCollection. If you need the build directory location, record it in a task property:
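Both suggestions can be sketched in one build.gradle.kts fragment, assuming Gradle 6.6. The configuration name tooling, the task ClasspathReportTask, and the output path are hypothetical.

```kotlin
// build.gradle.kts (sketch, assuming Gradle 6.6)
import org.gradle.api.DefaultTask
import org.gradle.api.file.ConfigurableFileCollection
import org.gradle.api.file.DirectoryProperty
import org.gradle.api.tasks.InputFiles
import org.gradle.api.tasks.OutputDirectory
import org.gradle.api.tasks.TaskAction

abstract class ClasspathReportTask : DefaultTask() {
    // The configuration is seen by the task only as a file collection.
    @get:InputFiles
    abstract val classpath: ConfigurableFileCollection

    // The location under the build directory is recorded as task state
    // instead of being looked up through project.buildDir in the action.
    @get:OutputDirectory
    abstract val outputDir: DirectoryProperty

    @TaskAction
    fun run() {
        val report = outputDir.file("classpath.txt").get().asFile
        report.writeText(classpath.files.joinToString("\n") { it.name })
    }
}

// Hypothetical configuration created by the build logic.
val tooling by configurations.creating

tasks.register<ClasspathReportTask>("classpathReport") {
    classpath.from(tooling)
    outputDir.set(layout.buildDirectory.dir("reports/classpath"))
}
```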
A common pattern that the Android Gradle plugin used to rely on was initializing some object on the first usage, storing it in a static field and using build listeners to clean up the state once the build finishes. As already mentioned, shared build services should be used for this use case. See the example below on how to use it:
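The following is a sketch of such a service rather than the Android Gradle plugin's actual code, assuming Gradle 6.6 and a build.gradle.kts context. ServerConnection, its url parameter, FetchTask, and the example URL are all hypothetical.

```kotlin
// build.gradle.kts (sketch, assuming Gradle 6.6)
import org.gradle.api.DefaultTask
import org.gradle.api.provider.Property
import org.gradle.api.services.BuildService
import org.gradle.api.services.BuildServiceParameters
import org.gradle.api.tasks.Internal
import org.gradle.api.tasks.TaskAction

// Shared state lives in the service instead of a static field.
abstract class ServerConnection :
    BuildService<ServerConnection.Params>, AutoCloseable {

    // Parameters are the service inputs; they are serialized in the
    // first run and restored in cached runs.
    interface Params : BuildServiceParameters {
        val url: Property<String>
    }

    init {
        println("Connecting to ${parameters.url.get()}")
    }

    fun fetch(path: String): String = "${parameters.url.get()}/$path"

    // Called by Gradle once the build no longer needs the service,
    // replacing the build listener previously used for cleanup.
    override fun close() {
        println("Disconnecting")
    }
}

abstract class FetchTask : DefaultTask() {
    @get:Internal
    abstract val connection: Property<ServerConnection>

    @TaskAction
    fun run() {
        println(connection.get().fetch("status"))
    }
}

val connectionProvider = gradle.sharedServices.registerIfAbsent(
    "serverConnection", ServerConnection::class.java
) {
    parameters.url.set("https://example.invalid")
}

tasks.register<FetchTask>("fetchStatus") {
    connection.set(connectionProvider)
    // Declares that the task uses the service.
    usesService(connectionProvider)
}
```

The service is instantiated lazily, the first time a task calls connection.get(), so builds that do not run fetchStatus never open the connection.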
The last piece of advice is that when implementing custom serializable types, be careful what to serialize. Make sure not to serialize derived properties and make those transient or use functions. E.g this is necessary as otherwise in the cached run you will get a stale value for the