Bagan: Automate experimentations on Gradle projects with Kubernetes

Published in

Google Developer Experts

10 min read · Sep 5, 2019

Bagan is an experimental framework used to automate the execution and collection of the build information of Gradle projects using Kubernetes:

cdsap/Bagan

Bagan is a framework that helps to automate the execution, reporting and collection of data with different types of…

github.com

What problem does it solve?

Months ago, I wanted to compare the impact of R8 against Proguard in Android projects. I started with Gradle Profiler and then used Talaiot to report the data. My problem was the limited resources of my personal laptop (8 GB) and the time I had to wait to check the results. I explored some of the typical CI options, orchestrating the changes in different branches, but I finally came up with a more flexible custom solution.

Another problem to solve is the typical copy and paste from StackOverflow/Twitter of magical properties to speed up our builds. Decisions about our build configuration should be backed by measurements and should match the different development environments.

And that’s why Bagan was born, offering:

Scalable solution depending on the infrastructure provided by environments like Kubernetes Engine.
Quick Feedback on experimentation.
No extra configuration required from the client-side in the repository.
Extensible with other cloud solutions.
Support for private repositories.

Executing Bagan

Once you have downloaded the repository, you need to set up the bagan_conf.json. There you can include different properties like the type of experiments you want to apply, the target repository or the resources you want to use in the Kubernetes environments.

Bagan is executed with the ./bagan command in the following format:

./bagan MODE COMMAND

Bagan performs the experiments in a Kubernetes environment. For each experiment, Bagan will create a Helm Release where it will run the build of the target repository applying the experimentation.

To report the build information, Bagan injects Talaiot into the Gradle configuration of the project. To collect the data, it uses InfluxDb as a time-series database and Grafana as a dashboard visualization tool. Bagan can deploy a new cluster or use an existing one to execute the experimentation defined in the main conf file.

Example bagan_conf.json:

{
   "bagan": {
      "repository": "https://github.com/android/plaid",
      "gradleCommand": "./gradlew clean :assembleDebug",
      "clusterName": "bagan",
      "machine": "n2-standard-2",
      "zone": "asia-southeast1-b",
      "private": false,
      "ssh": "",
      "known_hosts": "",
      "iterations": 20,
      "experiments": {
         "properties": [
            {
               "name": "org.gradle.jvmargs",
               "options": ["-Xmx2g","-Xmx4g","-Xmx6g"]
            }
         ]
      }
   }
}

This configuration will create, on the configured repository, three different experiments for the Gradle property org.gradle.jvmargs with values "-Xmx2g", "-Xmx4g" and "-Xmx6g".

Modes

Bagan uses modes to identify the environment in which Kubernetes will be set up and the experiments executed.

The modes currently supported are:

gcloud: It uses the Kubernetes Engine solution by Google Cloud. It requires the gcloud SDK to be installed. The gcloud tool will ask for the account and project id selected to run the cluster.
gcloud_docker: It uses the Kubernetes Engine solution, but all the necessary configuration for gcloud is encapsulated in a Docker image called cdsap/bagan-init. Use this option if you don’t want to install an additional SDK.
standalone: Use this mode if you have configured the kubectl client locally with an existing cluster, independent of the environment. If you want to test Bagan locally (and you have enough machine resources), you can use Minikube.

Commands

Once we have selected the mode, we need to specify which command we want to execute in Bagan. There are two main groups of commands:

Meta commands: Commands offering a complete execution of Bagan. A meta command may contain multiple single commands.
Single commands: Commands representing a single action to be executed in Bagan.

Examples of Meta/Single commands:

./bagan gcloud cluster: Creates the cluster with gcloud, creates the infrastructure in the cluster (Grafana, InfluxDb, services) and executes the experiments.
./bagan gcloud_docker experiment: Executes the experiments through the gcloud Docker image with the configuration provided by bagan_conf.json.
./bagan standalone experiment: Executes the experiments in the cluster that kubectl is configured for on the host machine.
./bagan gcloud_docker remove_experiments: Removes the previous experiments from the cluster using the gcloud Docker image.

Check more details about the commands available here.

Experiments

An Experiment is an entity that represents a specific state of the target repository. This state corresponds to a particular configuration of the build system or the version control system.

Currently, Bagan supports three different types of experiments:

Gradle Properties: Specifies different values for Gradle properties.
Gradle Wrapper Version: Specifies different versions of the wrapper.
Branch: Specifies different branches for experimentation.

To include experiments, we need to use the bagan_conf.json:

"experiments": {
   "properties": [
      {
         "name": "org.gradle.jvmargs",
         "options": ["-Xmx3g","-Xmx4g"]
      }
   ],
   "branch": ["develop","master"]
}

During the execution, the Kotlin class BaganGenerator.kt will calculate the different combinations of the types of experiments included in the configuration. For the above example the combinations are:

org.gradle.jvmargs="-Xmx3g" and branch develop
org.gradle.jvmargs="-Xmx3g" and branch master
org.gradle.jvmargs="-Xmx4g" and branch develop
org.gradle.jvmargs="-Xmx4g" and branch master
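
To make the combination step concrete, here is a minimal Python sketch of a Cartesian product over the configured experiment types. It is not the code in BaganGenerator.kt (which is Kotlin); the function name combine_experiments and the "property:" key prefix are assumptions made only for this illustration.

```python
from itertools import product


def combine_experiments(experiments):
    """Return one dict per combination of the configured experiment types.

    Hypothetical helper: each experiment type becomes one axis, and the
    result is the Cartesian product of all axes.
    """
    axes = []
    # Every Gradle property is its own axis, one entry per option.
    for prop in experiments.get("properties", []):
        axes.append([("property:" + prop["name"], value) for value in prop["options"]])
    # Branches and wrapper versions are axes too, when present.
    for key in ("branch", "gradleWrapperVersion"):
        if experiments.get(key):
            axes.append([(key, value) for value in experiments[key]])
    return [dict(combo) for combo in product(*axes)]


experiments = {
    "properties": [
        {"name": "org.gradle.jvmargs", "options": ["-Xmx3g", "-Xmx4g"]}
    ],
    "branch": ["develop", "master"],
}

for combo in combine_experiments(experiments):
    print(combo)
```

Run against the configuration above, the sketch yields four combinations in the same order as the list: each jvmargs value paired with develop and then with master.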

Later, in the section “Pod Experiment execution”, we will go into more detail about how the experimentation works internally.

Dashboard

Once the execution is finished, you can check the results on the Grafana instance generated.

For every execution, Bagan generates a Dashboard with the configuration parameters:

Before the execution of the experiments, the deployments of Grafana and InfluxDb are created. The release of Grafana includes a provisioned data source for InfluxDb. Later, a Dashboard is generated with the configuration provided by the bagan_conf.json. The dashboard contains five panels:

Graph of the command grouped by experiment.
80th percentile of the experiments for the configured command.
Minimum build times by experiment and command.
Experiment Winner.
Information about the experiments.
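
As an illustration of what the percentile, minimum and winner panels summarize, the following Python sketch computes the same kind of figures from build durations grouped by experiment. The durations are made-up sample values, and both the nearest-rank percentile method and the rule "lowest 80th percentile wins" are assumptions; the real dashboard queries InfluxDb and may define the winner differently.

```python
def percentile(values, p):
    """Nearest-rank percentile (an assumed method, for illustration only)."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, clamped to at least rank 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(durations_by_experiment):
    """Return the 80th percentile and minimum build time per experiment."""
    return {
        name: {"p80": percentile(values, 80), "min": min(values)}
        for name, values in durations_by_experiment.items()
    }


def pick_winner(summary):
    """Assumed rule: the experiment with the lowest 80th percentile wins."""
    return min(summary, key=lambda name: summary[name]["p80"])


# Made-up build durations in seconds, one list per experiment.
durations = {
    "experiment1": [100, 90, 95, 110, 105],
    "experiment2": [80, 85, 120, 82, 81],
}

summary = summarize(durations)
winner = pick_winner(summary)
print(summary)
print("winner:", winner)
```

Using a percentile rather than the mean keeps a single slow outlier (such as the 120 in experiment2) from dominating the comparison.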

To access the dashboard, open:

http://IP:3000/d/IS3q0sSWz

To retrieve the IP, you can use the Bagan command:

./bagan gcloud grafana_dashboard

The default user and password are admin/admin.

Additionally, you can create your own panels in Grafana. They follow the same scheme used in Talaiot, where the primary points are build and task:

Internals of Bagan

In this section, we are going to explore some of the internal concepts of Bagan. If you want to dig deeper, the GitHub repository contains more detailed information.

Kubernetes Infrastructure

Independent of the environment selected, the general infrastructure for Bagan in Kubernetes is:

To create the instances of Grafana and InfluxDb, we are using the Kubernetes package manager Helm. Helm is used to create experiments in Kubernetes too.

For the gcloud and gcloud_docker modes, additional Cluster Role Binding objects are created for use by tiller/Helm.

For gcloud and standalone we can use kubectl as a command-line interface for running commands against Kubernetes clusters:

Moreover, we can use the Google Cloud console, https://console.cloud.google.com, which provides a user interface to manage the cluster:

Regarding Helm, we can use helm commands in the host machine:

Pod Experiment execution

Previously, we saw that BaganGenerator.kt creates the experiments to be executed. Each experiment generates a Helm release with this structure:

values.yaml contains the information related to the experiment and the configuration provided in the Bagan conf.

repository: https://github.com/android/plaid.git
branch: master
configMaps: configmapexperiment1
pod: experiment1
session: sessionId
name: experiment1
image: cdsap/bagan-pod-injector:0.1.6
command: ./gradlew assemble
iterations: 10

configmapexperimentN.yaml contains the data of the permutation calculated for the experiment:

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.configMaps }}
  labels:
    app: bagan
    type: experiment
    session: {{ .Values.session }}
data:
  id: {{ .Values.name }}
  properties: |
               org.gradle.jvmargs=-Xmx6g
               org.gradle.workers.max=1

The execution of the build happens inside the pod created by podexperimentN.yaml. The Docker image used by the pod is cdsap/bagan-pod.

This pod is responsible for:

Fetching the target repository into a specific volume.
Injecting Talaiot into the project.
Applying the experimentation for Gradle Properties and Gradle Wrapper versions by parsing the data from the configmap.
Executing the build for N iterations.
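
The property step can be pictured as merging the key/value lines from the configmap into the project's gradle.properties. The Python sketch below illustrates that idea only; it assumes the experiment values simply override existing keys, and the function names parse_properties and apply_experiment are hypothetical, not part of Bagan.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def apply_experiment(gradle_properties, experiment_properties):
    """Return new gradle.properties content with the experiment values applied.

    Assumption: experiment values override existing keys and new keys are
    appended. Comments in the original file are dropped by this sketch.
    """
    merged = parse_properties(gradle_properties)
    merged.update(parse_properties(experiment_properties))
    return "\n".join(f"{key}={value}" for key, value in merged.items()) + "\n"


existing = "org.gradle.jvmargs=-Xmx2g\norg.gradle.parallel=true\n"
from_configmap = """
    org.gradle.jvmargs=-Xmx6g
    org.gradle.workers.max=1
"""

print(apply_experiment(existing, from_configmap), end="")
```

The sample configmap values match the ones in configmapexperimentN.yaml; the existing gradle.properties content is invented for the example.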

The execution flow is:

When the Pod is running, it executes ExperimentController.kt (using kscript), which starts by applying Talaiot to the main Gradle configuration file (Groovy/kts):

publishers {
    influxDbPublisher {
        dbName = "tracking"
        url = "http://bagan-influxdb.default:8086"
        taskMetricName = "tasks"
        buildMetricName = "build"
    }
}

Later, the ExperimentController.kt will parse the data of the configmap and will apply the different experiments. Note that in the case of Branch Experimentation, the experiment will be applied in the Pod and not in the configmap.

During the execution of the experiment, we can check the output of the build in the Pod log associated with the experiment in the Google Cloud console:

And the details of the resources used by the experiment:

Requirements by Mode

gcloud

jq: https://stedolan.github.io/jq/
gcloud: https://cloud.google.com/sdk/
kscript: https://github.com/holgerbrandl/kscript

gcloud_docker

jq: https://stedolan.github.io/jq/
docker: https://www.docker.com/

standalone

jq: https://stedolan.github.io/jq/
kubectl configured with an existing cluster
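
Before launching a mode, it can help to check that its tools are on the PATH. The following Python sketch does that with shutil.which; the REQUIRED_TOOLS table mirrors the lists above, while the missing_tools helper is a hypothetical name and not part of Bagan. It only checks that a binary exists, not that kubectl is actually configured against a cluster.

```python
import shutil

# Command-line tools required by each mode, taken from the lists above.
REQUIRED_TOOLS = {
    "gcloud": ["jq", "gcloud", "kscript"],
    "gcloud_docker": ["jq", "docker"],
    "standalone": ["jq", "kubectl"],
}


def missing_tools(mode, which=shutil.which):
    """Return the required tools for `mode` that are not found on the PATH.

    `which` is injectable so the check can be exercised without the tools.
    """
    if mode not in REQUIRED_TOOLS:
        raise ValueError(f"unknown mode: {mode}")
    return [tool for tool in REQUIRED_TOOLS[mode] if which(tool) is None]


if __name__ == "__main__":
    for mode in REQUIRED_TOOLS:
        missing = missing_tools(mode)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{mode}: {status}")
```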

The cost of Bagan

If you are using Google Cloud as the Kubernetes environment, you should consider the cost ($$) of the experimentation. Android projects are expensive in terms of memory consumption, and creating multiple combinations on machines without enough resources will cause the experiments to fail:

Increasing the resources of the machine will help the experiments succeed, but it also consumes more resources, a.k.a. money. Remember that more permutations of the different experiments will generate more Pods. For example, this configuration:

"experiments": {
   "properties": [
      {
         "name": "org.gradle.jvmargs",
         "options": ["-Xmx3g","-Xmx4g"]
      },
      {
         "name": "org.gradle.caching",
         "options": ["true","false"]
      }
   ],
   "branch": ["develop","master"],
   "gradleWrapperVersion": ["5.5","5.4"]
}

This will generate sixteen experiments (2 × 2 × 2 × 2):
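
The number of experiments is the product of the option counts of every experiment type, which is worth estimating before paying for a cluster. The Python sketch below does the arithmetic; count_experiments and total_builds are hypothetical helper names, and the total assumes every experiment runs the build once per configured iteration.

```python
from math import prod


def count_experiments(experiments):
    """Number of experiments: the product of the option counts of each type."""
    sizes = [len(prop["options"]) for prop in experiments.get("properties", [])]
    sizes += [
        len(experiments[key])
        for key in ("branch", "gradleWrapperVersion")
        if experiments.get(key)
    ]
    return prod(sizes)


def total_builds(experiments, iterations):
    """Assumption: each experiment executes the build `iterations` times."""
    return count_experiments(experiments) * iterations


experiments = {
    "properties": [
        {"name": "org.gradle.jvmargs", "options": ["-Xmx3g", "-Xmx4g"]},
        {"name": "org.gradle.caching", "options": ["true", "false"]},
    ],
    "branch": ["develop", "master"],
    "gradleWrapperVersion": ["5.5", "5.4"],
}

print(count_experiments(experiments))  # 16
print(total_builds(experiments, 20))   # 320
```

With 20 iterations, those sixteen experiments already mean 320 builds, so adding one more two-option property doubles both numbers.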

There is no restriction on the machine used in Bagan. For small/medium projects, we recommend the n2 series (cost per hour):

But it’s up to you which type of machine you use for the experiments. You can check more details about the machine types here:

Machine types | Compute Engine Documentation | Google Cloud

Whether your business is early in its journey or well on its way to digital transformation, Google Cloud's solutions…

cloud.google.com

The next report shows Bagan executed six times on the n2-standard-8 machine, with permutations of 4–16–24–16 experiments and 20–30 iterations:

If you are using Google Cloud only to create the cluster for Bagan, we recommend removing it once the experimentation and the evaluation of the results have finished.

Examples

The following examples are simple use cases showing how to experiment in your projects.

Use Case 1

A simple example of a private repository with three modules.

Two experiments of type Gradle Properties will be generated:

-Xmx3g
-Xmx4g

Result:

Use Case 2

Experimentation with Google project Plaid, exploring kapt properties.

Sixteen experiments will be generated:

Result:

We don’t observe significant differences on the Plaid project when using properties like kapt.incremental.apt and kapt.use.worker.api. However, you can notice the benefits of using caching in the Gradle builds.

Use Case 3

Experimentation on Android Showcase project by Igor Wojda 🤖, with Gradle properties and different Gradle Wrapper versions.

Result:

The best times are with Gradle 5.6.1. There are no significant differences in the JVM args experiments.

Next steps

The repository has more detailed information, such as how to deploy custom images and how the life cycle of Bagan works.

We want to extend Bagan soon with more types of experiments, like different Docker images (Java versioning), applying ABI changes, remote caching and support for multiple Gradle commands.

Stay tuned, and if you want to contribute, here is the repository:

cdsap/Bagan

Bagan is a framework that helps to automate the execution, reporting and collection of data with different types of…

github.com

Thank you very much for your time.

PS: Of course, if you have the opportunity, visit Bagan, which was recently included as a UNESCO World Heritage Site.


Written by iñaki villar
