The Dependency Jungle

The Dependency Tree is Actually More of a Jungle. And it’s Haunted.

Dan Lorenc
Dec 22, 2020

I was looking through the Kubernetes go.mod file, and noticed something weird. A few strange-looking dependencies that didn’t seem to belong. I’m still not quite sure what caught my attention about these specific modules — there are over 300 direct and indirect dependencies required to build Kubernetes — but these particular ones really didn’t make sense to me.

On a normal day I might have just gotten distracted and moved on, but I decided to really dig in here. I wanted to better understand the state of Go modules, the tooling around them, and what is going on in the dependency Jungle of cloud-native Go projects. Hunting down these dependencies was a perfect opportunity to do that.

After a few days, I think I finally understand things a little better, and I hopefully improved the state of a few projects. But the more digging I did here, the more strange projects like this I found. I call them “ghosts”. They obviously aren’t really ghosts, but they’re half-dead dependencies - waiting to haunt anyone that dares to read the go.mod file or the output of go list.

This is Part One of a series that explains some interesting things I found in the module graph, some of the improvements I think I made, and some of the problems I faced along the way. For anyone hoping to try out a similar adventure through the Haunted Dependency Forest, I’d recommend bringing a machete.

Part 1: 99 Bottles Of Beer…

This first ghost was a small, innocuous-looking repository named rsc.io/sampler, hanging out toward the bottom of the main Kubernetes go.mod file. I wanted to see what this module was, and how and why it was being used.

Step one was to find the source code. The Go tool supports vanity imports, which allow repositories to declare import names like this one while still being served from GitHub. To find the source for this repo, we can use a simple curl command:

$ curl -L rsc.io/sampler?go-get=1 | grep go-import
<meta name="go-import" content="rsc.io/sampler git https://github.com/rsc/sampler">

This meta tag shows us that the canonical location for this package is on GitHub. Opening it up in a browser is when things started to seem weird. This package had only a few files, and the latest commit appears to break it on purpose! What’s going on here?
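As a small aside on the lookup itself, here is a minimal sketch in Go of pulling the three fields (import prefix, version control system, repository URL) out of a go-import meta tag like the one curl returned. The GoImport type and parseGoImport function are names made up for this illustration, and the regular expression is a simplification that assumes the name attribute comes before content. The Go tool's own parsing is more thorough.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// goImportRe matches <meta name="go-import" content="...">.
// Simplifying assumption: name appears before content, both double-quoted.
var goImportRe = regexp.MustCompile(`<meta\s+name="go-import"\s+content="([^"]*)"`)

// GoImport is a hypothetical holder for the three fields of the tag.
type GoImport struct {
	Prefix   string // import path prefix, e.g. rsc.io/sampler
	VCS      string // version control system, e.g. git
	RepoRoot string // where the code is actually hosted
}

// parseGoImport returns the first go-import tag found in an HTML page.
func parseGoImport(html string) (GoImport, bool) {
	m := goImportRe.FindStringSubmatch(html)
	if m == nil {
		return GoImport{}, false
	}
	fields := strings.Fields(m[1])
	if len(fields) != 3 {
		return GoImport{}, false
	}
	return GoImport{Prefix: fields[0], VCS: fields[1], RepoRoot: fields[2]}, true
}

func main() {
	page := `<meta name="go-import" content="rsc.io/sampler git https://github.com/rsc/sampler">`
	if gi, ok := parseGoImport(page); ok {
		fmt.Printf("%s is served via %s from %s\n", gi.Prefix, gi.VCS, gi.RepoRoot)
	}
}
```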

The hello.go file has this for contents:

// Copyright 2018 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Translations by Google Translate.

package sampler

var hello = newText(`
English: en: 99 bottles of beer on the wall, 99 bottles of beer, ...`)

That’s it. 99 bottles of beer on the wall. The sampler.go file is quite a bit longer, but it’s still complete gibberish. So why is this in Kubernetes? What could it possibly have been used for? If harmless things like this are lurking around, what else could be slipping in undetected? Can we remove it?

But Why?

The first step to cleaning something up is understanding why it was there. And I was having trouble even there. The go mod why command was providing me with nothing.

$ go mod why rsc.io/sampler
# rsc.io/sampler
(main module does not need package rsc.io/sampler)

So it’s in here, but it’s not needed? A quick search of the Kubernetes vendor directory showed that this was correct: the package is not pulled in by go mod vendor, so k8s can build without it. That made me feel a lot better, but it still wasn’t good enough. If this module isn’t needed, we should be able to remove it, right? Maybe it was just added there by accident.

I tried a go mod tidy, and got nothing. The file was already tidied. Maybe it was added explicitly for some reason, and the go tool preserved it? I tried removing it myself:

$ go mod edit -dropreplace rsc.io/sampler

But k8s wanted me to put it back! As soon as I ran hack/update-vendor.sh, it reappeared! So there must have been some kind of reason it kept getting pulled in, even if go mod why couldn’t find it. Trying go mod graph gave me some better results. I’m still not completely sure of the difference between these two commands, but here’s the shortened result:

$ go mod graph | grep rsc.io/
github.com/golang/mock@v1.1.1 rsc.io/quote/v3@v3.1.0
github.com/golang/mock@v1.2.0 rsc.io/quote/v3@v3.1.0
github.com/golang/mock@v1.4.1 rsc.io/quote/v3@v3.1.0
gonum.org/v1/plot@v0.0.0-20190515093506-e2840ee46a6b rsc.io/pdf@v0.1.1
rsc.io/sampler@v1.3.0 golang.org/x/text@v0.0.0-20170915032832-14c0d48ead0c
rsc.io/quote/v3@v3.1.0 rsc.io/sampler@v1.3.0

From this output we can see that k8s depends on golang/mock, which depends on rsc.io/quote/v3, which depends on rsc.io/sampler. In case you’re wondering, the quote module is just as funny as sampler.
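Reading a chain like that by eye gets old quickly. go mod graph prints one requirement edge per line in the form "from to", so the same tracing can be sketched in Go as a breadth-first search. This is only an illustration: parseGraph and findPath are hypothetical helper names, and the first edge in the sample input (from k8s.io/kubernetes to golang/mock@v1.4.1) is an assumption added so the search has a root to start from. The other edges come from the output above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseGraph reads "from to" lines into an adjacency list.
func parseGraph(out string) map[string][]string {
	edges := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) != 2 {
			continue // skip blank or malformed lines
		}
		edges[f[0]] = append(edges[f[0]], f[1])
	}
	return edges
}

// findPath runs a breadth-first search from root and returns the shortest
// chain of requirements reaching a module whose name, ignoring any
// @version suffix, equals target. It returns nil if there is no such chain.
func findPath(edges map[string][]string, root, target string) []string {
	prev := map[string]string{root: ""}
	queue := []string{root}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if strings.SplitN(cur, "@", 2)[0] == target {
			var path []string
			for n := cur; n != ""; n = prev[n] {
				path = append([]string{n}, path...)
			}
			return path
		}
		for _, next := range edges[cur] {
			if _, seen := prev[next]; !seen {
				prev[next] = cur
				queue = append(queue, next)
			}
		}
	}
	return nil
}

func main() {
	// First line: assumed edge from the main module (not in the output above).
	out := `k8s.io/kubernetes github.com/golang/mock@v1.4.1
github.com/golang/mock@v1.4.1 rsc.io/quote/v3@v3.1.0
rsc.io/quote/v3@v3.1.0 rsc.io/sampler@v1.3.0
rsc.io/sampler@v1.3.0 golang.org/x/text@v0.0.0-20170915032832-14c0d48ead0c`
	path := findPath(parseGraph(out), "k8s.io/kubernetes", "rsc.io/sampler")
	fmt.Println(strings.Join(path, " -> "))
}
```

In practice the input would come from piping the real go mod graph output into something like this rather than from a hard-coded string.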

Busting The Ghost

Now I could see where sampler was coming from, but still didn’t know why. Things were starting to make more sense, though: the mock library is used for tests, which usually use a separate set of build tags, and test dependencies don’t necessarily get pulled into a vendor directory.

To get rid of it, I first took a look at the mock library on GitHub. Jumping right off the page was the fact that the dependency had recently been removed!

Great! This meant I should just be able to update golang/mock in k8s and drop this dependency. Thankfully, this was a pretty easy update: https://github.com/kubernetes/kubernetes/pull/97337. This PR ended up dropping both quote/v3 and sampler. Since these were never pulled into vendor, I don’t think this cleanup really had much of an impact. Still, it can’t hurt to remove things like this. At a minimum, they add constraints and complexity to the module resolution graph and make go mod work a little bit harder. Worst-case scenario, something malicious could eventually appear in these and make its way up into other programs. An update to an old dependency probably gets less scrutiny during review time than a change that introduces an entirely new one.

An Aside

I still wanted to understand why this was ever introduced into k8s. It was bugging me that a completely innocuous library like this, with seemingly no purpose, ended up in the dependency graph for such a critical piece of software. Reading the sampler repo itself didn’t give me much info. I could have just asked Russ Cox what it was for, but I wanted to try to find out on my own.

A Google search turned up a Hacker News discussion from a few years ago on the Using Go Modules blog post, which showed how the new modules feature worked. It looked like rsc.io/quote/v3 and rsc.io/sampler were example repositories created to show how to use Go modules.

That explains why the repos existed in the first place, but not why they ended up in golang/mock or anything further downstream. A quick bisect on the golang/mock go.mod file showed me the PR where they were introduced.

It was a PR to fix a bug where golang/mock had some trouble parsing major versions out of import paths when run in a module-aware context. So, the author added some tests of this support using the example repos from the blog post. Completely logical. So the golang/mock go.mod file contained a test dependency of a test dependency, and this repo propagated all over the Go dependency trees for anyone using mock.
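To make "parsing major versions out of import paths" a bit more concrete, here is a rough sketch in Go. This is not golang/mock's actual code, and splitMajor and guessPackageName are hypothetical names. It only illustrates the idea that for a path like rsc.io/quote/v3, the trailing v3 is a major version suffix and the likely package name is the element before it. It handles only the /vN form with N of 2 or more, and the real package name is whatever the source files declare.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMajor separates a trailing /vN element (N >= 2, no leading zero)
// from an import path. With no such suffix it returns the path unchanged
// and an empty major string.
func splitMajor(path string) (prefix, major string) {
	i := strings.LastIndex(path, "/")
	if i < 0 {
		return path, ""
	}
	last := path[i+1:]
	if !strings.HasPrefix(last, "v") {
		return path, ""
	}
	digits := last[1:]
	if digits == "" || digits[0] == '0' || digits == "1" {
		return path, ""
	}
	for _, r := range digits {
		if r < '0' || r > '9' {
			return path, ""
		}
	}
	return path[:i], last
}

// guessPackageName returns the last path element once any major version
// suffix has been removed. It is a guess, not a guarantee.
func guessPackageName(importPath string) string {
	prefix, _ := splitMajor(importPath)
	return prefix[strings.LastIndex(prefix, "/")+1:]
}

func main() {
	for _, p := range []string{"rsc.io/quote/v3", "rsc.io/sampler"} {
		prefix, major := splitMajor(p)
		fmt.Printf("%s: prefix=%s major=%q package=%s\n", p, prefix, major, guessPackageName(p))
	}
}
```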

So then why was it removed? Why was I able to just update? I initially guessed that it was introduced as some kind of accident and quickly deleted. That wasn’t the case, so why did it get deleted? Looking at the eventual PR to delete this, I stumbled on an even more interesting tidbit of Go history.

That PR linked to an issue:

which linked to another issue:

Remember the vanity URLs I described at the start? Russ Cox was hosting his vanity import redirector on Google App Engine, and it was built with a very old version of Go (1.6). At some point, App Engine dropped support for this version of Go. This ended up taking down his redirector, so Go tooling couldn’t find the rsc.io libraries and builds of golang/mock failed! The dependency was removed to fix this, not to actually clean up the dependency trees.

And, interestingly enough, the rsc.io/quote test case was never actually deleted - it was merely commented out! So while I was able to remove this ghost from my dependency tree, it could reappear at any time. This wasn’t really the closure I was hoping to find, but at this point I had found enough other (potentially more serious) threads to pull on that I decided to move on to those. The next part of the series will cover a couple of actual CVEs I found lurking around.


Written by Dan Lorenc, Software Engineer at Google
