SBT cache in GCP Cloud Build

Alexey Novakov
SE Notes by Alexey Novakov
5 min read · Sep 1, 2019

A Cloud Build job, like a job on any other CI platform, can take a long time to build your Java/Scala project with SBT, because project dependencies are downloaded from the web on every run until you plug in some caching mechanism. Cloud Build provides so-called builders, which are essentially Docker images used as steps in your CI pipeline. Let’s use a pair of them that work together: one step saves the cache and the other restores it.

Cache Builders

Cache builders are a simple and elegant solution: they take a cache folder and save it into a Google Storage bucket under some name, and later restore it from there. The name of the cached archive is going to reflect the current version of the project. We will come back to this feature later.

First we need to add both builders to the Container Registry of our GCP project. Let’s clone the repo and build both builders using the gcloud CLI:

git clone https://github.com/GoogleCloudPlatform/cloud-builders-community.git
cd cloud-builders-community/cache
gcloud builds submit --config cloudbuild.yaml .

The command above should push both builders to your GCP registry. We can check that the images are present via:

gcloud container images list --filter=_cache
NAME
gcr.io/<your proj id>/restore_cache
gcr.io/<your proj id>/save_cache

Create GS Bucket for cache

Cache builders store their archives in Google Storage buckets. We need to create a dedicated bucket to upload and download the archived SBT cache folder. This is how it looks in my GCP project:

Newly created bucket

The bucket name in my case is “sbt_cache”. We are going to use this bucket name later in the cloudbuild.yaml file.
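The bucket can also be created from the command line instead of the web console. Below is a minimal sketch, assuming gsutil from the Google Cloud SDK is installed and authenticated. The region is a made-up example, and the `run` helper is purely illustrative: it only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical values, replace with your own. Bucket names are globally
# unique, so "sbt_cache" itself may not be available to you.
BUCKET="sbt_cache"
LOCATION="europe-west1"   # assumed region, pick one close to your builds

# Dry-run helper (illustrative): print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

run gsutil mb -l "$LOCATION" "gs://$BUCKET"   # make the bucket
run gsutil ls -b "gs://$BUCKET"               # confirm that it exists
```

Run the script with RUN=1 to actually execute the two gsutil commands.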

Take SBT project

I will use my own project from GitHub to demonstrate that SBT takes some time to download the dependencies. Feel free to use your own project at this point, as the steps we are going to add are generic. Cache restore/save should work for any SBT project, and for any other build tool that caches dependencies on disk.

As we can see, akka-slick-vs-http4s-doobie-service/build.sbt has quite a few dependencies:

"org.tpolecat" %% "doobie-core"      % doobieVersion,
"org.tpolecat" %% "doobie-postgres" % doobieVersion,
"org.http4s" %% "http4s-blaze-server" % http4sVersion,
"org.http4s" %% "http4s-circe" % http4sVersion,
"org.http4s" %% "http4s-dsl" % http4sVersion,
"io.circe" %% "circe-generic" % circeVersion,
"io.circe" %% "circe-java8" % circeVersion,
"org.typelevel" %% "cats-core" % "1.4.0",
"com.typesafe.scala-logging" %% "scala-logging" % "3.9.0",
"ch.qos.logback" % "logback-classic" % "1.2.3",
"org.postgresql" % "postgresql" % "9.4-1203-jdbc4",
"com.github.pureconfig" %% "pureconfig" % "0.10.1",
"eu.timepit" %% "refined-pureconfig" % "0.9.3",
"com.typesafe.slick" %% "slick" % slickVersion,
"com.typesafe.slick" %% "slick-hikaricp" % slickVersion,
"com.typesafe.akka" %% "akka-http" % akkaHttpVersion,
"com.typesafe.akka" %% "akka-http-spray-json" % akkaHttpVersion,
"com.typesafe.akka" %% "akka-stream" % akkaVersion,
"com.typesafe.akka" %% "akka-http-testkit" % akkaHttpVersion % Test,
"com.typesafe.akka" %% "akka-testkit" % akkaVersion % Test,
"de.heikoseeberger" %% "akka-http-upickle" % "1.23.0",
"com.lihaoyi" %% "upickle" % upickleVersion,
"com.lihaoyi" %% "ujson" % upickleVersion,
"com.softwaremill.macwire" %% "macros" % "2.3.1",
"org.scalatest" %% "scalatest" % "3.0.5" % Test,
"com.dimafeng" %% "testcontainers-scala" % "0.20.0" % Test,
"org.testcontainers" % "postgresql" % "1.9.1" % Test,
"com.storm-enroute" %% "scalameter-core" % "0.10.1" % Test

Create cloudbuild file

steps:
- id: 'restore cache'
  name: 'gcr.io/$PROJECT_ID/restore_cache'
  args:
  - '--bucket=gs://sbt_cache'
  - '--key=build-cache-$( checksum build.sbt )'
  waitFor: ['-']

- id: 'check cache'
  name: 'ubuntu'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    ls -lah /workspace/.ivy2/cache | wc -l
  waitFor: ['restore cache']

- id: 'compile'
  name: 'gcr.io/$PROJECT_ID/scala-sbt'
  args: ['-ivy', '/workspace/.ivy2', 'compile', 'test']
  waitFor: ['restore cache']

- id: 'check files again'
  name: 'ubuntu'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    pwd
    ls -lah
  waitFor: ['compile']

- id: 'save cache'
  name: 'gcr.io/$PROJECT_ID/save_cache'
  args:
  - '--bucket=gs://sbt_cache'
  - '--key=build-cache-$( checksum build.sbt )'
  - '--path=/workspace/.ivy2/cache'
  - '--no-clobber'
  waitFor: ['compile']

There are 5 steps. Two of them are just for the demo: ‘check cache’ and ‘check files again’. The logic of the steps:

  1. Download the archive ‘build-cache-<some-number>’ if it exists and unpack it into the current folder, which is ‘/workspace’ by default.
  2. List the contents of ‘/workspace/.ivy2/cache’ and count the lines to see whether the cache folder has been restored. Our archive should contain the folder ‘.ivy2/cache’ with a bunch of files, i.e. cached project libraries.
  3. Compile and test the project by running “sbt compile test”, overriding the ivy2 location with the ‘-ivy /workspace/.ivy2’ option. We ask SBT to look for the cache inside the folder we have just unpacked. That is the trick. This step assumes that a ‘scala-sbt’ builder image is already present in your registry; the same community repo should contain one that can be built in the same way as the cache builders.
  4. List files in the current folder to show that the cache is still there to be saved in the next step.
  5. Save the cache to the Google Storage bucket. With the ‘--no-clobber’ option, the archive is not uploaded if an archive with the same name already exists. If the checksum of the ‘build.sbt’ file is the same as before, our build dependencies did not change, so the key is the same and the upload is skipped.
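The key logic in steps 1 and 5 can be tried locally. The sketch below is only a simulation under stated assumptions: a temporary directory stands in for the bucket, tar stands in for the builders' archiving, and cksum stands in for the builders' `checksum` helper, whose actual hash may differ. All function names are hypothetical.

```shell
#!/bin/sh
# Local simulation only: no GCS and no real builders are involved.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
bucket="$work/bucket"                      # stand-in for gs://sbt_cache
mkdir -p "$bucket" "$work/workspace/.ivy2/cache"
cd "$work/workspace" || exit 1

echo '"org.typelevel" %% "cats-core" % "1.4.0"' > build.sbt
echo "pretend jar" > .ivy2/cache/cats-core.jar

# Archive name derived from the content of build.sbt (cksum is an assumption).
archive_path() {
  echo "$bucket/build-cache-$(cksum < build.sbt | cut -d' ' -f1).tgz"
}

# Step 1: unpack the archive into the current folder if the key exists.
restore_cache() {
  if [ -e "$(archive_path)" ]; then tar -xzf "$(archive_path)" && echo hit
  else echo miss; fi
}

# Step 5 with --no-clobber semantics: never overwrite an existing archive.
save_cache() {
  if [ -e "$(archive_path)" ]; then echo skip
  else tar -czf "$(archive_path)" .ivy2/cache && echo saved; fi
}

r1=$(restore_cache)   # first build, nothing stored yet
s1=$(save_cache)      # archive is stored
s2=$(save_cache)      # same build.sbt, so the upload is skipped
rm -rf .ivy2          # a fresh build starts with an empty workspace
r2=$(restore_cache)   # archive is found and unpacked
echo '"com.github.pureconfig" %% "pureconfig" % "0.10.1"' >> build.sbt
r3=$(restore_cache)   # dependencies changed, so the key changed
echo "$r1 $s1 $s2 $r2 $r3"   # prints: miss saved skip hit miss
```

The last result shows the trade-off of keying on build.sbt: any change to that file, even one that adds a single library, starts from an empty cache.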

Check build time improvement

Let’s build our project and check how the save/restore cache works.

cd akka-slick-vs-http4s-doobie-service/
gcloud builds submit --config cloudbuild.yaml .

You can either observe the build process in your terminal or go to the GCP web console.

Results:

  1. Very first build: no cache yet. We should see the following message at the ‘check cache’ step:

ls: cannot access ‘/workspace/.ivy2/cache’: No such file or directory

as we have not had a chance to store the cache yet.

build time: 5 min 21 sec

  2. Second build. At this point we already have a cache in the bucket:

GS bucket with one archive in it

The ‘check cache’ step should print 118. This is the number of lines in the listing, which roughly matches the number of entries in the cache.

build time: 3 min 2 sec
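To put the two measurements side by side, here is a small arithmetic sketch using only the build times reported above (integer arithmetic, so the percentage is rounded down):

```shell
#!/bin/sh
first=$((5 * 60 + 21))            # first build without cache, in seconds
second=$((3 * 60 + 2))            # second build with cache, in seconds
saved=$((first - second))         # seconds saved by the cache
percent=$((saved * 100 / first))  # share of the first build time, rounded down
echo "saved ${saved}s (${percent}%)"   # prints: saved 139s (43%)
```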

Summary

A dependency cache significantly shortens the build time: in this example, from 5 min 21 sec to 3 min 2 sec. Cloud Build has a nice idea of an image per step, which allows us to reuse already available images instead of building a monolithic all-purpose image for every new project. I like this image-per-step feature very much, as it allows us to compose different images in one build. The cache builders are a perfect example of such composability: they were contributed by the community to the builders repo.
