Optimizing CI with Bazel and Kaniko in Cloud Build

Darren Evans
Google Cloud - Community

In part 1 of this article, we covered a myriad of techniques and configuration options to help you achieve faster build and deployment times. Here in part 2, we delve into more advanced methods to further tune the performance of the continuous integration (CI) process using Google Cloud Build, helping to reduce your overall build and deployment time regardless of your software stack or complexity.

In this article, we’ll cover more advanced topics, including:

  • What is Bazel and how it works
  • Using Bazel for deterministic builds
  • An example of how faaassst Bazel actually is
  • Setting up a GCS bucket with Bazel for remote caching
  • Using Bazel from Cloud Build
  • Bazel: Exploring remote persistent workers
  • Using Bazel with parallel build steps
  • Considerations when using Bazel
  • Using Kaniko or Docker for caching
  • Summary

Overview of Bazel

  • Bazel is open source; inside Google it’s called Blaze
  • Designed to work with both monorepos and multirepos
  • Bazel supports multiple languages: Java, JavaScript, Go, C++, Android, iOS
  • Bazel runs on Windows, macOS, and Linux
  • Bazel rebuilds exactly what’s necessary: no missed build steps that leave stale binaries, and no unnecessary steps that make builds take longer than they should
  • It caches all previously passed tests
  • Bazel is very fast at building, as it caches previous work and recompiles only the code changed
  • Starlark is a Python-inspired domain-specific language deeply integrated with Bazel. It serves several crucial functions: defining BUILD files, implementing rules, creating macros, defining workspace rules, and writing Bazel aspects (a small macro sketch follows below)
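
To make that concrete, here is a minimal, hypothetical Starlark macro; the file path, defaults, and target names are illustrative and not from any real project:

# tools/defaults.bzl (hypothetical) — a small Starlark macro wrapping the native cc_test rule
def default_cc_test(name, srcs, deps = []):
    """cc_test with project-wide defaults applied."""
    native.cc_test(
        name = name,
        srcs = srcs,
        deps = deps,
        size = "small",
        copts = ["-Wall"],
    )

# A BUILD.bazel file would then use it like this:
# load("//tools:defaults.bzl", "default_cc_test")
# default_cc_test(name = "core_test", srcs = ["core_test.cc"])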

Bazel — How it works

  • Bazel calls your top-level directory a workspace
  • Bazel loads all packages in your target’s dependency graph
  • This includes declared dependencies and the files listed directly in the target’s BUILD.bazel file
  • Bazel analyzes the rules defined in Starlark by parsing the BUILD.bazel files, which declare all the dependencies that affect them, e.g. collecting all C++ header files or the jars on a classpath
  • Once the BUILD.bazel files have been parsed, a graph of actions is created
  • Dependencies outside of your workspace, such as Maven artifacts or source code in other GitHub repositories, are declared with bzlmod
  • Bazel executes the compilers and other tools of the build
  • A package contains all your related files and dependencies and a file named BUILD.bazel
  • The elements of a package are called targets; most targets are one of two principal kinds, files and rules

Here are two packages in a workspace rooted at src/my/app; each BUILD.bazel file marks the package that contains it:

src/my/app/BUILD.bazel
src/my/app/app.cc
src/my/app/core/input.txt
src/my/app/data/input.txt
src/my/app/tests/BUILD.bazel
src/my/app/tests/test.cc
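
For illustration, the BUILD.bazel of the top-level package might declare a single C++ binary target; the rule and attribute values below are assumptions based purely on the file names above:

# src/my/app/BUILD.bazel (hypothetical contents)
cc_binary(
    name = "app",
    srcs = ["app.cc"],
    # Files in core/ and data/ belong to this package because those
    # directories have no BUILD.bazel of their own
    data = [
        "core/input.txt",
        "data/input.txt",
    ],
)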

But how fast is Bazel? Blazingly faaassssttt…

$ git clone https://github.com/GoogleContainerTools/distroless
$ cd distroless
$ bazel build //java:java_base_root_amd64_debian12

INFO: Analyzed target //java:java_base_root_amd64_debian12 (192 packages loaded, 9905 targets configured).
INFO: Found 1 target…
Target //java:java_base_root_amd64_debian12 up-to-date:
bazel-bin/java/java_base_root_amd64_debian12
INFO: Elapsed time: 130.010s, Critical Path: 6.54s
INFO: 113 processes: 46 internal, 67 linux-sandbox.
INFO: Build completed successfully, 113 total actions

$ bazel build //java:java_base_root_amd64_debian12
INFO: Analyzed target //java:java_base_root_amd64_debian12 (0 packages loaded, 0 targets configured).
INFO: Found 1 target…
Target //java:java_base_root_amd64_debian12 up-to-date:
bazel-bin/java/java_base_root_amd64_debian12
INFO: Elapsed time: 0.268s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

The resulting distroless image can then be used as a minimal base image in a Dockerfile, for example:

FROM gcr.io/distroless/static-debian12
EXPOSE 8080
COPY --from=build /go/bin/app /
CMD ["/app"]

Setting up a GCS bucket with Bazel for remote caching

Bazel remote caching enables the distribution and reuse of build results across different machines, leading to substantial improvements in build speed.

Bazel works best with a persistent cache, which can be located either locally or remotely.

Here we use a GCS bucket for caching, which prevents rebuilding dependencies when not required. You can use other backends too.

$ BUCKET_BLAZECACHE="bazelllllll-cache"
$ gcloud storage buckets create gs://${BUCKET_BLAZECACHE} --location=europe-west6
$ gcloud storage buckets add-iam-policy-binding gs://${BUCKET_BLAZECACHE} --member='serviceAccount:888888888888@cloudbuild.gserviceaccount.com' --role='roles/storage.objectViewer' --project=my-project
$ bazel build //... \
  --remote_cache=https://storage.googleapis.com/${BUCKET_BLAZECACHE} \
  --google_default_credentials
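
Rather than repeating these flags on every invocation, they can live in a .bazelrc at the workspace root; the bucket name below is the placeholder from above and the last flag is optional:

# .bazelrc — persist remote cache settings for all builds
build --remote_cache=https://storage.googleapis.com/bazelllllll-cache
build --google_default_credentials
# Upload locally executed results to the cache so other machines can reuse them
build --remote_upload_local_results=true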

Using Bazel from Cloud Build

Cloud Build provides a Bazel builder image, gcr.io/cloud-builders/bazel. While this image is supported by the Cloud Build team, it might not include the latest Bazel release (one possible workaround is sketched after the example below).

Here’s an example cloudbuild.yaml snippet.

### Build Java App
- id: 'build_java'
  name: 'gcr.io/cloud-builders/bazel'
  dir: 'Bazelbuild'
  entrypoint: 'bazel'
  args: ['build', '//:HelloWorld_deploy.jar',
         '--curses=no',
         '--spawn_strategy=local',
         '--remote_cache=https://storage.googleapis.com/my-cacheeeeee',
         '--google_default_credentials',
         '--test_output=errors']
  waitFor: ['-']
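
If you need a newer Bazel than the builder image ships with, one possible workaround is a small custom builder based on Bazelisk, which downloads whatever Bazel version your repository pins in .bazelversion. This is a sketch; the base image and installation path are assumptions:

# Dockerfile for a custom Bazel builder image (hypothetical)
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl ca-certificates && rm -rf /var/lib/apt/lists/*
# Bazelisk transparently fetches and runs the Bazel version pinned in .bazelversion
RUN curl -fsSL -o /usr/local/bin/bazel \
    https://github.com/bazelbuild/bazelisk/releases/latest/download/bazelisk-linux-amd64 \
    && chmod +x /usr/local/bin/bazel
ENTRYPOINT ["bazel"]

Build and push this image to Artifact Registry, then reference it in the name field of your build step instead of gcr.io/cloud-builders/bazel.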

Bazel remote execution and modes of operation

Bazel remote execution

Bazel’s remote execution feature allows build actions (the individual steps in the build process) to be executed on remote machines, while Bazel itself coordinates the build from a local machine. This offers several benefits:

Consistent Build Environment: Build actions are executed in a standardized environment on the remote machine(s), reducing inconsistencies that can arise from differences in local development environments.

Potential for Faster Builds: By distributing build actions across multiple remote machines, Bazel can potentially achieve parallelism and speed up the overall build process. However, this depends on factors like project structure, network latency, and the capabilities of the remote infrastructure.

Remote Caching: Bazel can leverage remote caches to store and retrieve build outputs (artifacts). If a build action’s inputs haven’t changed, Bazel can fetch the cached output instead of re-executing the action, significantly improving build times.

Shared build outputs: Teams can reuse build outputs, streamlining workflows and saving valuable time and resources.

Bazel persistent workers

Persistent workers are like keeping a tool handy and ready to go. Imagine you’re assembling furniture and you need a screwdriver frequently. Instead of putting it away after each use, you keep it nearby so you can quickly grab it again. Persistent workers do the same thing for Bazel. For example, when compiling Java code, instead of starting up the Java Virtual Machine (JVM) every time to run the Java compiler, a persistent worker keeps the JVM running. This saves time because the JVM doesn’t need to restart for each compilation task, making the overall build process faster.
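
Worker behaviour can be tuned from the command line or a .bazelrc; the values below are illustrative, not recommendations:

# .bazelrc — persistent worker tuning (values are illustrative)
build --strategy=Javac=worker   # compile Java through a persistent worker
build --worker_max_instances=4  # cap the number of worker processes per worker type
build --worker_verbose          # log when workers are started and shut down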

Bazel remote persistent workers

Think of remote persistent workers like having a dedicated toolbox in a shared workshop. Instead of keeping the JVM running on your own computer (where Bazel runs), you have a separate, powerful computer that keeps the JVM ready for use. Whenever you need to compile Java code, Bazel can send the task to that separate computer, which already has the JVM up and running. This way, you save time and resources by not starting up the JVM on your own machine for each task.

Bazel leverages the open-source gRPC protocol to enable both remote execution and remote caching functionalities.

Remote execution backends usually consist of more than one server: typically a frontend, a cache, and the actual workers where the work happens.

These servers implement the remote execution, asset, and logstream APIs, and layer usability features on top, such as web UIs, frameworks to build on, and configuration wizards. Benefits include:

  • Faster build and test execution through scaling of nodes available for parallel actions
  • A consistent execution environment for a development team
  • Reuse of build outputs across a development team

You can explore different execution and remote caching tools to achieve this — they are all implementations of the Bazel remote execution API.
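
Whichever backend you pick, pointing Bazel at it comes down to a few flags; the endpoint and instance name below are placeholders:

# Point Bazel at a remote execution service (endpoint and instance name are placeholders)
bazel build //... \
  --remote_executor=grpcs://remote.example.com:443 \
  --remote_instance_name=default \
  --google_default_credentials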

Bazel Remote Builds on GKE using Cloud Build

One such setup is a workflow on Google Cloud that leverages Buildbarn services to run Bazel remote builds inside a Kubernetes cluster (GKE), driven from Cloud Build.

Faster Bazel Builds with Cloud Build Parallel Steps

Cloud Build’s parallel steps can complement Bazel’s build process, allowing you to run multiple Bazel build targets concurrently.

Multiple Build Steps

  • You can background certain steps to run tasks, such as code compilation, in parallel
  • Then wait for those steps to complete
  • Sequential compiling can create bottlenecks
  • Through the UI you can dive into the logs

Cloud Build — Parallel Steps — Sample code

# Sample cloudbuild.yaml for using Bazel with Cloud Build in parallel steps
steps:
# Build Java App (Parallel Step 1)
- id: 'build_java'
  name: 'gcr.io/cloud-builders/bazel'
  dir: 'Bazelbuild' # Location of your Bazel workspace
  entrypoint: 'bazel'
  args: [
    'build',
    '//:HelloWorld_deploy.jar', # Replace with your target
    '--curses=no',
    '--spawn_strategy=local',
    '--remote_cache=https://storage.googleapis.com/my-cacheeeeee', # Replace with your GCS bucket
    '--google_default_credentials',
    '--test_output=errors'
  ]
  waitFor: ['-'] # Start immediately so this step runs in parallel
# Build Go App (Parallel Step 2)
- id: 'build_go'
  name: 'gcr.io/cloud-builders/bazel'
  dir: 'Bazelbuild'
  entrypoint: 'bazel'
  args: [
    'build',
    '//:GoApp_binary' # Replace with your target
    # Add other Bazel options as needed
  ]
  waitFor: ['-'] # Start immediately so this step runs in parallel
# Run Tests (after parallel builds)
- id: 'test'
  name: 'gcr.io/cloud-builders/bazel'
  dir: 'Bazelbuild'
  entrypoint: 'bazel'
  args: [
    'test',
    '//...' # Run tests for all targets
    # Add other Bazel options as needed
  ]
  waitFor: ['build_java', 'build_go'] # Wait for both parallel builds to finish
# Add a step to push the container image here
# … other steps as needed (e.g., deployment)

Considerations when using Bazel

Bazel has a learning curve and may require some initial migration effort. However, the performance gains and efficiency improvements make it a valuable tool for optimizing the time spent in your Cloud Build CI pipelines.

Kaniko — What is it

Kaniko is an open-source tool specifically designed to build container images within environments where you cannot directly access a Docker daemon.

Kaniko caches image layers and creates a cache repository in Artifact Registry for you.

If the layer exists, kaniko will pull and extract the cached layer instead of executing the command. If not, kaniko will execute the command and then push the newly created layer to the cache.

Kaniko — Using it with Cloud Build

steps:
- name: 'gcr.io/kaniko-project/executor:latest'
  args: [
    "--destination=${_MYREGION}-docker.pkg.dev/${_PROJECT_ID}/quickstart-docker-repo/numbers-image:${BUILD_ID}",
    "--destination=${_MYREGION}-docker.pkg.dev/${_PROJECT_ID}/quickstart-docker-repo/java-image:${BUILD_ID}",
    "--cache=true",
    "--cache-ttl=8h"
  ]
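
One practical note: Kaniko's layer cache pays off most when the Dockerfile puts rarely changing instructions (such as dependency installation) before frequently changing ones (such as copying source code), so the expensive layers stay cache hits. A hypothetical example; the base image and build commands are assumptions:

# Dockerfile ordered for better layer-cache hits (hypothetical)
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
# Dependencies change rarely: copying only the pom first keeps this layer cached
COPY pom.xml .
RUN mvn -q dependency:go-offline
# Source changes often: only the layers below are rebuilt on a typical commit
COPY src ./src
RUN mvn -q package -DskipTests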

Summary

This article looked at optimizing CI with Bazel and Kaniko in Cloud Build. It discussed using Bazel for faster builds and Kaniko for building container images in environments without a Docker daemon. Bazel is an open-source build system that caches previous work and rebuilds only what has changed, and it pairs well with Cloud Build's parallel steps. Kaniko caches container image layers, further reducing build time.

In conclusion, by leveraging Bazel and Kaniko, developers can streamline and accelerate their CI/CD pipelines in Cloud Build.
