I’m the author of checksum-dependency-plugin. The purpose of this article is to convince you to use checksums and/or PGP signatures in your build scripts.
The article focuses on the Java ecosystem (Sonatype OSS repository, Apache Maven, Gradle); however, very similar principles are used with great success in other dependency managers.
Even though checksums speed up build execution, I treat that as out of scope for the article (unless someone proves it should be in scope).
Note: I’m not a security expert; however, I do my best to represent things in a proper light.
I don’t provide an extensive list here, however below are the articles that are relevant. It does not mean I fully agree with all the contents, however I think they are worth reading. Please take into account publication dates as certain bits might be just out of date.
The need for verification
Virtually all software uses dependencies in one way or another.
For instance, let us consider the following:
You declare a couple of lines, and the build system downloads the required dependency transparently for you. There’s a catch: what if somebody has replaced the file at the Maven Central repository or intercepted the network request?
If the log4j-core.jar is replaced, then you will be executing arbitrary code that you have just downloaded from the Internet.
You don’t want to execute arbitrary code on your machine, so you want to verify that the downloaded dependencies are “what you think they are”.
Note: there’s no “ultimate security level”. It might turn out your computer was rooted at the factory, and all your keystrokes are logged and silently sent over bluetooth. You never know unless you completely disassemble your hardware.
On the other hand, you don’t want to grab random jars from the Internet and execute them. So you need to figure out “good enough” security level that suits your workflow.
For instance, I don’t disassemble my digital piano to check if it is trying to steal data from me. I just assume that risk is too low. Someone might scream aloud, but I’m willing to trust the official release managers of Apache Log4j. That is, I’m OK to execute a jar if I know it was produced by the Apache Log4j project. Your preferences might vary; however, you still need to identify a set of identities you trust, and you need to establish a way to verify that the artifacts were produced by the trusted parties.
Maven Central, JCenter and others are secure. It is just enough security
You should trust neither Maven Central nor JCenter blindly.
Note: even if you inspect the GitHub sources for the library in question, that does not mean the published jar files were built from the source files you’ve inspected.
Here’s a recent case, when a malicious jar file was published while the sources looked OK: https://blog.autsoft.hu/a-confusing-dependency/
Here’s a case for NPM: https://news.ycombinator.com/item?id=14901566
Note: it does not mean JCenter is bad. It just means “source code available” is not enough to trust the jar. The case could have been prevented if PGP was used to verify the jar.
Level 0: HTTPS
Rule #0: please never use http:// URLs for repositories. The connections over plain HTTP are not that hard to intercept, so just use https://
JCenter will deny HTTP requests
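Rule #0 is easy to check mechanically. Here is a minimal sketch, assuming you keep the repository URLs of your build in a plain list; the `insecure_repositories` helper and the `repository.acme.com` URL are hypothetical and only serve as an illustration, they are not part of any build tool.

```python
from urllib.parse import urlparse


def insecure_repositories(urls):
    """Return the repository URLs that do not use the https scheme."""
    return [u for u in urls if urlparse(u).scheme.lower() != "https"]


# Hypothetical list of repositories declared in a build.
repos = [
    "https://repo1.maven.org/maven2/",
    "http://repository.acme.com/releases/",  # plain HTTP: should be flagged
]

for url in insecure_repositories(repos):
    print("Insecure repository URL:", url)
```

A check like this could run in CI, so a plain HTTP repository does not slip into the build unnoticed.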
Level 1: checksums
Isn’t https enough? When you use https://repository.acme.com/… URL it typically means the server has to provide a valid certificate for repository.acme.com domain. This is good provided you never add “root certificates” to the keystore.
Here’s how you verify SHA1:
$ openssl dgst -sha1 log4j-core-2.11.1.jar
My computation seems to agree with the value that Maven Central shows to me. Does that mean the file is good to use? Unfortunately, it does not.
If someone intercepts my requests to https://repo1.maven.org, then they can intercept requests to the .sha1 file as well and compute the checksum on the fly.
Does that mean checksums are completely useless? It does not. Checksums enable one to verify if the file was downloaded properly.
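The openssl comparison above can also be scripted. Below is a minimal sketch that uses only the Python standard library; the function names `file_digest` and `matches` are mine, and the expected value is whatever you copied from a source you trust.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=64 * 1024):
    """Compute the hex digest of a file without loading it fully into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex, algorithm="sha1"):
    """Compare the computed digest with the expected one (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches("log4j-core-2.11.1.jar", expected)` returns True only when the local file has the SHA1 value stored in `expected`. As discussed above, a match only proves that the file was downloaded properly, not that it comes from a trusted party.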
As you might guess, manual verification of the checksums is not practical.
Certain build systems (e.g. Bazel) require every dependency declaration to be accompanied by its checksum:
maven_jar(
    name = "com_google_guava_guava",
    artifact = "com.google.guava:guava:18.0",
    sha1 = "cce0823396aa693798f8882e64213b1772032b09",
    sha1_src = "ad97fe8faaf01a3d3faacecd58e8fa6e78a973ca",
)
In other words, the expected checksums need to be declared somehow. Unfortunately, neither Maven (as of 3.6.2) nor Gradle (as of 5.6.2) allows one to declare checksums that will be verified during dependency resolution.
Implementing checksums via plugins
If you are using Gradle, you can use checksum-dependency-plugin.
However, there’s a question: where do you get “the expected checksum” from? It should be taken from a place you trust.
For instance, the Apache Maven download page has links to the expected SHA-512 of the source and binary archives. However, it would be very surprising if every library author published the expected SHA-512 for all the library versions. It would be cool, but today the coverage is far from complete.
A semi-solution would be to download the file somehow (e.g. from the server you trust using the machine you trust), compute the checksum, and bake it into the build script. Then the build would warn you if someone else gets another checksum for exactly the same artifact.
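The “bake the checksum into the build script” idea can be sketched as follows. This is an illustration of the principle only, not how checksum-dependency-plugin is implemented; the `com.acme:demo:1.0` coordinate, the `verify_pinned` function, and the in-memory byte strings are hypothetical stand-ins for real artifacts.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when an artifact is unknown or its checksum differs from the pinned one."""


def verify_pinned(coordinate, content, pinned):
    """Check the SHA-512 of `content` against the value pinned for `coordinate`."""
    expected = pinned.get(coordinate)
    if expected is None:
        raise ChecksumMismatch(f"No pinned checksum for {coordinate}")
    actual = hashlib.sha512(content).hexdigest()
    if actual != expected:
        raise ChecksumMismatch(
            f"{coordinate}: expected sha512 {expected[:16]}..., got {actual[:16]}..."
        )
    return actual


# Step 1 (trusted machine): compute the checksum once and store it with the build.
trusted_bytes = b"bytes downloaded on a machine you trust"
pinned = {"com.acme:demo:1.0": hashlib.sha512(trusted_bytes).hexdigest()}

# Step 2 (any machine): every later download must match the pinned value.
verify_pinned("com.acme:demo:1.0", trusted_bytes, pinned)
try:
    verify_pinned("com.acme:demo:1.0", b"tampered bytes", pinned)
except ChecksumMismatch as e:
    print("Violation:", e)
```

The pinned value is only as trustworthy as the first download, which is why the text above calls this a semi-solution.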
Level 2: PGP
Checksums enable you to recognize when the file differs; however, they do not link artifacts to the trusted identities.
Apache Software Foundation provides https://checker.apache.org/ that enables you to type a checksum and it tells you if the checksum is a part of the official release. Note: Apache focuses on source code releases, so it is likely you won’t find a match for a jar file.
Luckily, PGP helps to solve that issue.
TL;DR: PGP allows you to create cryptographic signatures, so everybody can verify that the signature was created by the owner of the private key. In other words, the log4j-core-2.12.0.jar.asc signature is “impossible” to forge (see “malicious PGP identity” below).
Note: PGP signs a hash digest of a file, so it is important that a strong digest is used. Here’s a case when a PGP signature uses a SHA1 checksum, which is not that secure. Here’s a PR to Gradle to use SHA256 when signing artifacts.
If you use checksum-dependency-plugin, you can declare:
<trust-requirement pgp='GROUP' checksum='MODULE' />
That would mean the plugin would ensure that artifacts from the org.apache.logging.log4j group are signed with the PGP key 3595395eb3d8e1ba.
Note: it is the default mode for checksum-dependency-plugin. It tries to use PGP for dependency verification. Keep in mind that a dependency might not be signed at all: for instance, Gradle Plugin Portal does not allow publishing PGP signatures as of 2019–09–09.
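To make the group-level rule more concrete, here is a minimal sketch of the policy lookup only. It assumes some PGP tool has already verified the signature cryptographically and reported the signing key ids; the `is_trusted` function and the dictionary layout are my own illustration, not the plugin’s actual code or configuration format.

```python
# Trusted PGP key ids per Maven group (the key id below is the one from the text).
TRUSTED_KEYS = {
    "org.apache.logging.log4j": {"3595395eb3d8e1ba"},
}


def group_of(coordinate):
    """Extract the group from a 'group:artifact:version' coordinate."""
    return coordinate.split(":")[0]


def is_trusted(coordinate, signer_key_ids, trusted_keys=TRUSTED_KEYS):
    """True if at least one verified signature was made with a key trusted for the group."""
    allowed = trusted_keys.get(group_of(coordinate), set())
    return any(key_id.lower() in allowed for key_id in signer_key_ids)


print(is_trusted("org.apache.logging.log4j:log4j-core:2.12.0", ["3595395eb3d8e1ba"]))
print(is_trusted("org.apache.logging.log4j:log4j-core:2.12.0", ["80c08b1c29100955"]))
```

The first call prints True and the second prints False: a valid signature made with a key that is not trusted for the group is still a violation.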
Malicious PGP identity
New PGP keys are very easy to create, so you shouldn’t trust each and every PGP key.
However, keep in mind that Maven/Gradle never verify PGP identities during artifact resolution, so if you add a requirement for PGP keys, you make your build more secure even if you have never met the key owner in person. Requiring a valid PGP signature makes it much harder for an attacker to forge a malicious artifact. On the other hand, you don’t need to update the set of PGP ids often: you can update library versions, and the existing PGP keys would likely match.
Verification of PGP identities
https://www.netbsd.org/developers/pgp.html#sign-recommendation suggests you need to meet the key owner in person and verify their ID. Unfortunately, that is not always possible.
For Apache projects you can find PGP keys that are used by release managers in the relevant KEYS files (see https://www.apache.org/dist/logging/KEYS ).
If the project page fails to list the PGP key, you can meet the project lead in question and/or file an issue asking to publish the project signing key in the README.
The release manager might sign Git tags as well. For instance, you can see that JaCoCo 0.8.4 is signed with key CB43338E060CF9FA.
https://keybase.io/godin shows that the owner of that key controls GitHub account https://github.com/Godin, Twitter account https://twitter.com/_Godin_, and so on. After investigation of the listed accounts, it might look OK to just trust that key. However, beware that all those accounts might be fake.
Level 3: PGP+checksum
If PGP alone is not enough, you might want to use both PGP and checksums.
The declaration for checksum-dependency-plugin would be:
<trust-requirement pgp='MODULE' checksum='MODULE' />
It would be a bit more complicated to support (every dependency update would require adding a new checksum); however, SHA-512 would prevent unexpected changes, and PGP would simplify the review (you see that the old and the new library versions are signed with the same PGP key).
Level 4: who verifies the verifier?
I don’t think Maven (as of 3.6.2) allows you to block the execution of a plugin if its checksum does not match expectations, although I’m not certain.
Gradle allows one to resolve the classpath of the build script and compute checksums before the plugins are activated/applied. That enables you to download checksum-dependency-plugin from the repository and still verify that the file matches your expectations.
See the full sample in the plugin README.
The important bit here is that the plugin is added via the settings.gradle script (or settings.gradle.kts if the Kotlin DSL is used). That enables the plugin to intercept all further resolutions.
The expected checksums are published in the README file. On top of that, you can check out plugin sources, build it in your environment and crosscheck the resulting jar file. Travis CI prints the expected checksums as well.
Level 5: sandbox repository
Your build is never secure if you allow it to download random code from the Internet. It does not matter if you verify checksums or not.
For instance, a Gradle build script is written in Apache Groovy / Kotlin, and the script can issue a plain old HTTP(s) request to download an extra dependency. That won’t be visible to “dependency verification plugins”, so you are exposed to security issues.
The ultimate solution would be to set up a proxy repository and allow your build machine to access that proxy repository only. That would intercept all the possible resolution requests, and the proxy repository can decide if a certain dependency is allowed or not.
That is great if you host an Artifactory and/or Nexus Pro repository. However, the proxy approach is not very feasible for open-source development.
Just in case, Artifactory configuration can be found at “But Maven just goes out there and brings stuff”
Nexus Pro can work as a proxy repository that transparently verifies PGP signatures for all the resolved dependencies
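The allow/deny decision that such a proxy repository makes can be sketched as below. This is a toy model, not an Artifactory or Nexus configuration: it assumes the standard Maven repository layout (group directories, then artifact, version, and file name) and a hypothetical allowlist of groups.

```python
def parse_maven_path(path):
    """Split 'org/acme/lib/1.0/lib-1.0.jar' into (group, artifact, version, filename)."""
    parts = path.strip("/").split("/")
    if len(parts) < 4:
        raise ValueError(f"Not an artifact path: {path}")
    *group_parts, artifact, version, filename = parts
    return ".".join(group_parts), artifact, version, filename


def is_allowed(path, allowed_groups):
    """Allow the request only if the artifact belongs to an explicitly allowed group."""
    try:
        group, _artifact, _version, _filename = parse_maven_path(path)
    except ValueError:
        return False  # deny anything that does not look like an artifact request
    return group in allowed_groups


ALLOWED_GROUPS = {"org.apache.logging.log4j"}  # hypothetical allowlist

print(is_allowed(
    "org/apache/logging/log4j/log4j-core/2.12.0/log4j-core-2.12.0.jar",
    ALLOWED_GROUPS,
))
print(is_allowed(
    "com/github/adrielcafe/AndroidAudioRecorder/0.3.0/AndroidAudioRecorder-0.3.0.aar",
    ALLOWED_GROUPS,
))
```

The sketch ignores metadata requests such as maven-metadata.xml, which have a shorter path; a real proxy repository handles those, along with checksum and signature files.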
Does PGP really help?
TL;DR: it does.
Consider a case from https://blog.autsoft.hu/a-confusing-dependency/
(Un)fortunately the dependency in question (com.github.adrielcafe:AndroidAudioRecorder:0.3.0) has been removed from JCenter.
However, the article mentions: “Created by the obviously fake jakewhaarton”
Let us try adding a random JakeWharton / picasso dependency:
}
$ ./gradlew help
checksum-dependency-plugin identifies the issue and fails the build:
FAILURE: Build failed with an exception.

* What went wrong:
A problem occurred configuring project ':bitcoinj-wallettemplate'.
> Checksum/PGP violations detected on resolving configuration :bitcoinj-wallettemplate:runtimeClasspath
No trusted PGP keys are configured for group com.jakewharton.picasso:
com.jakewharton.picasso:picasso2-okhttp3-downloader:1.1.0 (pgp=[80c08b1c29100955], sha512=[computation skipped])
No trusted PGP keys are configured for group com.squareup.picasso:
com.squareup.picasso:picasso:2.5.2 (pgp=[80c08b1c29100955], sha512=[computation skipped])

You might want to add -PchecksumFailOn=build_finish if you are brave enough
It will collect all the violations, however untrusted code might be executed (e.g. from a plugin)
We see that two unknown dependencies appear in the build script (picasso2-okhttp3-downloader and picasso). We see that both dependencies are signed with the same PGP key: 80c08b1c29100955
How can we tell if the key belongs to a true person or a fake one?
Luckily for us, Keybase shows that the owner of key 80c08b1c29100955 controls quite a few accounts, and the accounts look decent.
Even though I have never met Jake in person yet, the above is quite convincing for me to trust key 80c08b1c29100955 for com.jakewharton.picasso and com.squareup.picasso packages. Well, it is not clear why Jake releases under com.squareup.picasso, so it needs extra investigation; however, https://github.com/square/picasso/graphs/contributors shows that Jake contributes to that library a lot, so it looks safe.
Note: checksum-dependency-plugin creates an updated $rootDir/build/checksum/checksum.xml file so you can compare it with the current one. If you are brave enough, you might add -PchecksumUpdate so the root checksum.xml would be updated automatically.
This case is quite trivial: a single Keybase search results in a rich identity. Many thanks to Jake for verifying the identity via Keybase.
Frequently Asked Questions
Which security level should I use?
If you can afford a proxy repository, go for it, and configure the set of trusted dependencies (~ Level 5).
If you work on an open-source project/library/application, I would suggest you use “Level 2: PGP” for verification of your builds.
I’m using https and it is just fine
It is good if you use https for all the repositories; however, it does not protect you from consuming a jar file released by a fake identity.
PGP is useless without “web of trust”
That depends. At some point you need to have a way to identify the release manager you trust. That can come from your web of trust. However, sometimes a keybase.io verified identity is good enough.
It might look like “web of trust” would enable you to verify artifacts; however, it looks like no one ever uses the PGP web of trust. See https://blog.filippo.io/giving-up-on-long-term-pgp/
PGP seems to be the lesser evil for verifying artifacts in projects where you can’t afford to have a proxy repository. PGP makes your builds more secure even if you never participate in a key signing party.
I implement a library, so I do not need dependency verification
This is false. Even if your project is just a library, you still do not want your computer to execute arbitrary code from the Internet. You need to verify dependencies to keep your Bitcoins and passwords safe.
PGP is useless: you still suggest to just trust Twitter/GitHub/… accounts
I do not suggest that you trust a Twitter account. In fact, there’s no way to verify if the jar in question was published by https://twitter.com/JakeWharton or not. Jake does not publish a checksum for each and every released jar. So how should I verify if the jar was released by the true Jake or a fake Jake? PGP allows me to verify that the jar was signed by the owner of key 80c08b1c29100955, and it is hard to provide a malicious signature with the same key id.
Note: https://www.netbsd.org/developers/pgp.html#sign-recommendations says that you should verify not just key id, but other key properties like “key length”.
Should I verify POM files?
Gradle (as of 5.6.2) does not treat .pom files as artifacts, so it does not fire the “dependency resolution” listener for them when resolving regular dependencies.
However, a malicious pom file can’t do much harm: it does not contain executable code, so it can’t result in remote code execution. Of course, it might result in a different set of build dependencies; however, https://reproducible-builds.org/ could help you to verify if builds on different machines result in the same binaries.
io.spring.gradle:dependency-management-plugin resolves and processes .pom files as regular artifacts. That resolution can be intercepted and verified. So you can validate POM files; however, it requires non-trivial effort and I do not think it adds much.
I do not find .asc signature for a dependency. What should I do?
It happens that even very popular libraries fail to ship PGP signatures. When you see that, please ask the project to sign the releases.
For instance, here’s an issue I’ve filed for ObjectWeb ASM: https://gitlab.ow2.org/asm/asm/issues/317878. It turns out the lack of PGP signatures was not expected by the developers.
A newer dependency version is signed with a completely different key. What should I do?
You need to figure out if the new key is the official release signing key for the artifact in question. Note: the same binary can be signed with multiple keys, however all of the keys might belong to fake identities.
So you should ask the project team to clarify which PGP keys are mandatory for signing official releases, and you should not blindly trust any new keys you meet.
Clarification issue to Kotlin team: https://youtrack.jetbrains.com/issue/KT-33781
How would I know if PGP key can be trusted?
See the “Verification of PGP identities” section above.
The artifact is signed, and PGP key is listed on the project website as the official one. Does that mean the artifact is safe for use?
Unfortunately, a valid signature does not mean the artifact is safe to use.
A valid signature means that the artifact was released by the right person; however, keep in mind that a wrench costs just $5.
I’m a library author. Should I use PGP?
If you release a library, please ensure you sign your releases, and please publish PGP key ids at the project webpage. That would enable consumers to verify if they use the official releases or not.
You might want to verify the dependencies of your library as well, especially if you shade/bundle artifacts. For instance, it is common that ObjectWeb ASM is repackaged and shaded. However, you do not want to ship malicious code, so you need to verify that the version in question is the official one.
I’m using a version range. Should I ignore the verification then?
Please consider making your builds reproducible. That is, a version range is just fine provided the resolution ends up with a predictable version.
On top of that, even if you want to use ranges like 1.+ you might still get some security out of PGP verification. PGP keys do not change that often, so you might just list all the possible keys, and automatic updates would play well with the verification.
Ok. I want to implement dependency verification in my Gradle project. What should I do?
The steps to integrate the plugin are:
- Add the relevant section to settings.gradle (and buildSrc/settings.gradle if you have one)
- Collect current dependencies. You can use ./gradlew allDependencies -PchecksumUpdate. The allDependencies task comes with the plugin
- Add -PchecksumFailOn=build_finish -PprintChecksum to your CI configuration. That would collect all the violations and print the updated checksum.xml contents to the build output.
Note: by default the plugin fails on the first violation. That ensures your builds are secure (you don’t accidentally pull untrusted artifacts); however, that is painful for CI (you don’t want to add new dependencies one by one). On the other hand, you might want to refrain from using -PchecksumFailOn=build_finish for “deploy CI jobs” because you do not want to execute untrusted code in an environment that is capable of publishing artifacts.
Is checksum-dependency-plugin signed?
Nice catch. Unfortunately, Gradle Plugin Portal rejects PGP signatures. Please vote for the issue so Gradle plugins could be signed.
As of now, the builds of checksum-dependency-plugin are reproducible, so you can build your own jar and end up with exactly the same binary.
I develop an Android application and I see .aar files lack PGP signatures. Is that intentional?
.aar files could be signed as well, so please ask the artifact authors to provide PGP signatures.
If you are curious, here’s an example of what an aar dependency looks like.
Does anybody use checksum/PGP for dependency verification?
Debian has enforced PGP for releases for quite a while. CentOS requires PGP as well. The Apache Software Foundation requires that all official releases be signed, and recommends using PGP for verification. PGP does not completely prevent malicious code execution; however, PGP verification is better than nothing.
Here’s a set of PRs for Java applications and libraries.
- https://github.com/mvmike/min-cal-widget/pull/44 “HTTPS is enough”
- https://github.com/openhab/openhab-android/pull/1532 “Dependabot would stop working”