Golang Modules & Immutability

Daz Wilkin
Google Cloud - Community
5 min read · Jul 10, 2019

Yesterday, I wrote a summary of my recent switch to Go Modules. In the conclusion, I wrote that I’m moving to a single ${GOPATH} across my projects. One of the advantages of Modules is that a package version should be immutable. This implies that, once you’ve pulled a given version of a package, you should never have to pull it again.

Of course, that only works if you use a single machine. What happens, for example, when you use Docker? And is there a way to extend this to Google Cloud Build?

Docker Build

Let’s add a distroless Docker Build to the example. Create this Dockerfile in ${WORKDIR}:

FROM golang:1.12 as build
RUN printf "[timer] start\t%s\n" $(date +%s%N)
COPY ./foo /foo
WORKDIR /foo
RUN GO111MODULE=on GOPROXY=https://proxy.golang.org go build foo
RUN printf "[timer] end\t%s\n" $(date +%s%N)
FROM gcr.io/distroless/base
COPY --from=build /foo /
CMD ["/foo"]

Here are the results of running this 10 times, using --no-cache so that Docker doesn’t reuse cached layers between builds:

for t in {1..10}
do
docker build \
--no-cache \
--tag=foo:latest . \
| grep "^\[timer\]"
done
[timer] start 1562692830352570907
[timer] end 1562692833701779047
[timer] start 1562692837743692624
[timer] end 1562692841088710635
[timer] start 1562692845149766319
[timer] end 1562692848465842721
[timer] start 1562692852464160691
[timer] end 1562692855829662065
[timer] start 1562692859765082861
[timer] end 1562692863149075221
[timer] start 1562692867096511753
[timer] end 1562692870563710328
[timer] start 1562692874552237719
[timer] end 1562692877931096022
[timer] start 1562692881991442336
[timer] end 1562692885298613761
[timer] start 1562692889252781564
[timer] end 1562692892541429803
[timer] start 1562692896460210629
[timer] end 1562692899837684581

According to Sheets, the elapsed time (end minus start) has a mean of 3357913011 nanoseconds (about 3.36 seconds) and a standard deviation of 50407471 nanoseconds.
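
For reference, the same statistics could be computed in the shell rather than in a spreadsheet. This is a minimal sketch under assumptions: timer_stats is a hypothetical helper name (it is not part of the original setup), the [timer] lines are assumed to arrive in start/end pairs as in the output above, and the standard deviation is the sample (n-1) form.

```shell
# timer_stats reads "[timer] start <ns>" / "[timer] end <ns>" pairs on stdin
# and prints the mean and sample standard deviation of the elapsed times.
# NB hypothetical helper, assuming strictly alternating start/end lines.
timer_stats() {
  while read -r tag label ns; do
    case "$label" in
      start) start=$ns ;;
      # Integer shell arithmetic keeps the 19-digit timestamps exact.
      end) echo $(( ns - start )) ;;
    esac
  done | awk '
    { d[NR] = $1; sum += $1 }
    END {
      if (NR < 2) exit 1   # a sample standard deviation needs 2+ values
      mean = sum / NR
      for (i = 1; i <= NR; i++) ss += (d[i] - mean) * (d[i] - mean)
      printf "mean=%.0f stdev=%.0f\n", mean, sqrt(ss / (NR - 1))
    }'
}
```

Piping the output of the build loop above (everything after done) into timer_stats should report a mean of roughly 3.36 seconds, in nanoseconds, in line with the Sheets figure.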

Docker COPY

This isn’t a great solution, but it’s not possible to mount e.g. a Docker Volume (which would be more elegant) during a Docker build. Instead, we can COPY the local package cache into the image. Amend the Dockerfile (or create another):

FROM golang:1.12 as build
RUN printf "[timer] start\t%s\n" $(date +%s%N)
COPY ./go/pkg /go
COPY ./foo /foo
WORKDIR /foo
RUN GO111MODULE=on GOPROXY=https://proxy.golang.org go build foo
RUN printf "[timer] end\t%s\n" $(date +%s%N)
FROM gcr.io/distroless/base
COPY --from=build /foo /
CMD ["/foo"]

This time, we’ll run the test by mounting the golang-module-mirror into the container as /go. On the first run, which we’ll discard, our mirror will be created; subsequent runs should utilize it:

[timer] start 1562694281225041674
[timer] end 1562694285811156939
[timer] start 1562694289319807596
[timer] end 1562694293776883985
[timer] start 1562694297206823489
[timer] end 1562694301530111138
[timer] start 1562694304936211074
[timer] end 1562694309252913601
[timer] start 1562694312650657096
[timer] end 1562694316880599577
[timer] start 1562694320360184122
[timer] end 1562694324700460041
[timer] start 1562694328110722373
[timer] end 1562694332475305957
[timer] start 1562694335900959477
[timer] end 1562694340500560831
[timer] start 1562694343962632837
[timer] end 1562694348394060653
[timer] start 1562694351868504357
[timer] end 1562694356285662159

Presumably because my local cache is so minuscule (glog only), the time to COPY outweighs the benefit here: a mean of 4406618035 nanoseconds (about 4.41 seconds) and a standard deviation of 117921035.6 nanoseconds.

Google Cloud Build

I’ve written several times about Google Cloud Build. It’s an understated and compelling service that provides a mechanism to run a series of container images, pipelining the results from one to the next. Most (!) of the time, the service is used to build container images but it can be used to build many more assets.

I realized that, by default, when using a library/golang image rather than gcr.io/cloud-builders/go, you need to persist e.g. /go between build steps so that two distinct golang steps can share a ${GOPATH}.

The solution to this is to use Cloud Build volumes (see below). In this example, the volume named go-modules is created and mounted on /go for the first step that needs it and the volume persists until no more steps reference it. For every step that references it subsequently, the cache of e.g. go get packages is retained.

2019-07-10 Update: I learned that Cloud Build supports an options section in which env and volumes can be defined once and applied to every step, rather than repeated in every step as I have below.

steps:
- name: golang:1.12
  env:
  - GOPATH=/go
  - GO111MODULE=on
  - GOPROXY=https://proxy.golang.org
  dir: foo
  args:
  - go
  - build
  - -o
  - /go/bin/test1
  - foo
  volumes:
  - name: go-modules
    path: /go
- name: golang:1.12
  env:
  - GOPATH=/go
  - GO111MODULE=on
  - GOPROXY=https://proxy.golang.org
  dir: foo
  args:
  - go
  - build
  - -o
  - /go/bin/test2
  - foo
  volumes:
  - name: go-modules
    path: /go
- name: golang:1.12
  env:
  - GOPATH=/go
  - GO111MODULE=on
  - GOPROXY=https://proxy.golang.org
  dir: foo
  args:
  - go
  - build
  - -o
  - /go/bin/test3
  - foo
  volumes:
  - name: go-modules
    path: /go
- name: busybox
  args:
  - ls
  - -l
  - /go/bin
  volumes:
  - name: go-modules
    path: /go
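
The options section mentioned in the update could shorten this config considerably. The following is an untested sketch of that idea, assuming the same foo sources and the same go-modules volume name. It moves the env and volumes definitions into options so that they apply to every step, and it drops GOPATH=/go on the assumption that it is already the default for the golang images:

```yaml
# Sketch only: env and volumes are defined once under options
# and apply to every step in the build.
options:
  env:
  - GO111MODULE=on
  - GOPROXY=https://proxy.golang.org
  volumes:
  - name: go-modules
    path: /go
steps:
- name: golang:1.12
  dir: foo
  args: ["go", "build", "-o", "/go/bin/test1", "foo"]
- name: golang:1.12
  dir: foo
  args: ["go", "build", "-o", "/go/bin/test2", "foo"]
- name: golang:1.12
  dir: foo
  args: ["go", "build", "-o", "/go/bin/test3", "foo"]
- name: busybox
  args: ["ls", "-l", "/go/bin"]
```

One consequence of this form is that the busybox step also receives the Go environment variables, which is harmless here.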

NB I think GOPATH=/go is redundant here, as /go is the default GOPATH (and working directory) for the golang images.

NB Because Cloud Build uses /workspace as a working directory, our sources will be located in /workspace/foo but our packages etc. are persisted in /go.

NB Because our sources are in a subdirectory of /workspace, I’m using Cloud Build’s dir modifier to make /workspace/foo the working directory for the builds.

NB I’m using GOPROXY to utilize the Golang team’s Go Modules Mirror.

So, in this admittedly trivial example, each build (of the same thing) creates a differently named binary (testX) and puts this in the shared /go/bin directory.

The purpose, though, is to show how the packages are cached in /go/pkg once they have been pulled by the first build step:

gcloud builds submit \
--config=./cloudbuild.yaml \
--project=[[YOUR-PROJECT]]
...
starting build "22137c6c-df17-418b-8177-c231dd34b719"
FETCHSOURCE
...
BUILD
Starting Step #0
Step #0: Pulling image: golang:1.12
Step #0: 1.12: Pulling from library/golang
Step #0: Digest: sha256:3fee5835...
Step #0: Status: Downloaded newer image for golang:1.12
Step #0: go: finding github.com/golang/glog v0.0.0
Finished Step #0
Starting Step #1
Step #1: Already have image: golang:1.12
Finished Step #1
Starting Step #2
Step #2: Already have image: golang:1.12
Finished Step #2
Starting Step #3
Step #3: Pulling image: busybox
Step #3: Using default tag: latest
Step #3: latest: Pulling from library/busybox
Step #3: Digest: sha256:bf510723...
Step #3: Status: Downloaded newer image for busybox:latest
Step #3: total 5892
Step #3: test1
Step #3: test2
Step #3: test3
Finished Step #3
PUSH
DONE

However, if we remove (or comment out) the references to volumes, remove the busybox step (since no files will have been created in /go/bin), and rerun the build:

starting build "3837eb7d-dae6-476e-9e72-67ee2f57de86"
FETCHSOURCE
...
BUILD
Starting Step #0
Step #0: Pulling image: golang:1.12
Step #0: 1.12: Pulling from library/golang
Step #0: Digest: sha256:3fee5835...
Step #0: Status: Downloaded newer image for golang:1.12
Step #0: go: finding github.com/golang/glog v0.0.0
Finished Step #0
Starting Step #1
Step #1: Already have image: golang:1.12
Step #1: go: finding github.com/golang/glog v0.0.0
Finished Step #1
Starting Step #2
Step #2: Already have image: golang:1.12
Step #2: go: finding github.com/golang/glog v0.0.0
Finished Step #2
PUSH
DONE

You’ll see that, this time, the glog package is pulled again in every build step. This is because we’re no longer sharing /go/pkg across the build steps.

I suspect that this mechanism would be a time-saver if, for example, you have an existing multi-stage Dockerfile or you have a repo comprising several binaries and you want to emit different container images for each binary.
