Introducing Package Diff
By convention, running npm version will create a git tag with a format like v1.0.0, but again, nothing strictly enforces that the git tag is a snapshot of the code released to npm. In fact, it often isn’t: an author may create a version, modify their README in a follow-up commit, and only then run npm publish. There are even prepublish scripts and .npmignore files, which introduce differences between the source code repository and the package contents.
There is no requirement that code being uploaded in an npm module is equivalent to the code stored publicly in a git repository.
That’s what we built Package Diff for. That and inspiration from Mikeal Rogers:
This tool does essentially what Mikeal described: it downloads package tarballs from the npm registry (the only source of truth about a package’s contents) and recursively compares the files within them.
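To make the comparison step concrete, here is a minimal sketch of a recursive file comparison between two package versions. It is not Package Diff's actual implementation. It assumes both tarballs have already been downloaded and extracted into local directories, and the names listFiles and comparePackages are hypothetical helpers invented for this illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: walk a directory tree and return a Map of
// relative file path -> file contents (as a Buffer).
function listFiles(root, dir = '') {
  const files = new Map();
  const entries = fs.readdirSync(path.join(root, dir), { withFileTypes: true });
  for (const entry of entries) {
    // Use forward slashes so paths look the same on every platform.
    const rel = path.posix.join(dir, entry.name);
    if (entry.isDirectory()) {
      for (const [file, content] of listFiles(root, rel)) {
        files.set(file, content);
      }
    } else if (entry.isFile()) {
      files.set(rel, fs.readFileSync(path.join(root, rel)));
    }
  }
  return files;
}

// Hypothetical helper: classify every file as added, removed, or modified
// between two extracted package directories (assumed to exist already).
function comparePackages(oldRoot, newRoot) {
  const oldFiles = listFiles(oldRoot);
  const newFiles = listFiles(newRoot);
  const result = { added: [], removed: [], modified: [] };

  for (const [file, content] of newFiles) {
    if (!oldFiles.has(file)) {
      result.added.push(file);
    } else if (!content.equals(oldFiles.get(file))) {
      result.modified.push(file);
    }
  }
  for (const file of oldFiles.keys()) {
    if (!newFiles.has(file)) {
      result.removed.push(file);
    }
  }

  // Sort each list so the output is stable regardless of directory order.
  for (const list of Object.values(result)) {
    list.sort();
  }
  return result;
}
```

One way to get the inputs is to run npm pack with a package name and version for each release and extract the resulting tarballs into separate directories. A real tool would then render a line-by-line diff for each modified file; this sketch stops at listing which files changed.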
Each comparison is represented by a permalink containing the package name and two version numbers. A table of contents on the left lists the files changed between the two versions. The right column shows each changed file, along with a few lines of context around every change.
This tool can be used for many purposes. Developers can use it to view changes in their modules over time, perhaps to discover why a module has increased in size. It can also be used in conversations to explain why a new package release has violated semver. It can even serve as a convenient GUI for viewing diffs when a package isn’t hosted in a repository with a UI, such as a personal git repository.
The use case we’re most excited about at Intrinsic is security audits. As mentioned before, the underlying git repository cannot always be trusted to show package differences (as was the case with the event-stream incident), which means a GitHub URL won’t always cut it. We’re excited about the potential for this tool to be used in postmortems of packages that shipped security issues, whether accidental or actively malicious, since Package Diff will always show the exact differences between package releases. For example, here is a list of a few security issues introduced into modules:
email@example.com(npm advisory #322)
firstname.lastname@example.org(npm advisory #597, Snyk #20160901)
email@example.com(npm advisory #599, Node.js Security WG #394)
firstname.lastname@example.org(npm advisory #720)
Please experiment with Package Diff and reply in the comments with any interesting package comparisons you find!
This article was written by me, Thomas Hunter II. I work at a company called Intrinsic (btw, we’re hiring!) where we specialize in writing software for securing Node.js applications. We currently have a product which follows the Least Privilege model for securing applications. Our product proactively protects Node.js applications from attackers, and is surprisingly easy to implement. If you are looking for a way to secure your Node.js applications, give us a shout at email@example.com.
Original Banner Photo by Alberto Restifo.