JavaScript Modules: Welcome to My Emo Hellscape
tl;dr: unless all dependencies use the same module format, a dependency tree of depth > 1 is so painful nobody does it. Nobody agrees on a solution to this problem. We are all fucked.
Update: There is an open Issue on bower/bower with a proposed solution to some of these issues.
Last week I had a small meltdown on twitter about npm’s future plans around front end packaging.
After a back and forth with @seldo (who is always exceedingly patient with me), I ended up even more emo than before.
If you haven’t read npm’s plan, let me summarize: “we’re going to borrow bower’s flat dependency tree concept and hope the problem will just solve itself.”
My twitter emo about npm stems from the fact that they don’t have a public plan for the 800lbs gorilla-in-the-room problem that makes browser JavaScript dependency management a hellscape: module formats.
Before I dive into this topic, I’d like to establish some common understanding and terminology for discussion. If you have a comfortable knowledge of this space, just go straight to the section “The Browser JavaScript Package Manager” or, if you just want to cry, see “Abandon All Hope”. Otherwise http://butt.holdings:
Communities of browser JavaScript developers
The browser JavaScript language is a big tent. Although this doesn’t cover every community, I roughly divide most development work with JavaScript in the browser into three broad camps:
1. Front End
This development work occurs entirely after some other system has returned a page of HTML.
This developer is probably using jQuery although she might have ventured into Angular for its improved code organization.
She’s unlikely to use a package manager. Instead she visits the websites of popular libraries and uses their download links. If she is using a package manager, it’ll be bower, but that is much rarer than just downloading files from a library’s marketing site.
This group is the biggest and broadest for browser JavaScript. Conservatively 70%, but probably closer to 95% of all people using JavaScript in the browser today.
2. Full Stack
This developer does everything in the above category and also writes the code that responds to HTTP requests with a dynamically generated HTML page. She’s probably using Rails or Node and is certainly familiar with package management from those environments. She’s still just getting browser JavaScript assets via download and dropping them into a /public folder. Less likely, she’s installing assets via Rubygems and exposing them with the Rails Asset Pipeline (awful) or using npm and browserify (acceptable).
She’ll be using jQuery and Backbone/Angular. I’d peg the size of this group anywhere from 25%-5% of browser JavaScript.
3. Full Frontal
This developer is doing all of the above two categories, except she’s architected her platform to avoid the pains of mixed environment rendering: the server application deals with data persistence, long running tasks, etc and the client handles all display rendering and user event handling. Possibly the client application is communicating with several services and those services also power the iOS, Android, and desktop client applications.
She’s very familiar with build tools, HTML, CSS, the DOM, and is looking eagerly at new ES6 language features and new browser technologies. Likely she’s using Ember.js or Angular.
At best 5% of JavaScript in the browser. Probably closer to 1%, but this group is the only one experiencing growth. Many of the #thoughtleaders have moved into this space.
Authorship and Consumption
Package managers have two primary users: library consumers and library authors.
Library Consumer
Wants to use other people’s work to build something larger. She already has a preferred workflow, build process, and module format for her code and wants to add external dependencies without additional format conversion incantations.
If she’s a “full stack” or “full frontal” developer, she’s familiar with how useful dependency management can be in environments where package location and structure is mandated and a single module format is baked in: It Just Works™. She wants that ease of use. She doesn’t want to download source code and hand-craft a special build. She doesn’t want bespoke artisanal build tools for each library she uses.
She’s looked at Browserify, but not everything is available as CommonJS.
Library Author
Publishes her work for others to use. She already has a preferred workflow, build process, and module format for her code.
She is keenly aware that bundling dependencies leads to consumers having multiple logical copies of common libraries in different module formats, but that not bundling means she’ll get 10 Github Issues each week from developers who didn’t load dependencies on their own and are confused (“when I load this it just says undefined is not a function, I think it’s broken”).
She wants to support as many people as possible without forcing them to change their preferred module and build process. She’s emo because there’s no way to do this with just source files. All of her libraries have a /dist directory checked into fucking source control, she has to research and build her own conversion tools, educate every committer about them, and keeps getting PRs targeting the build files.
All this keeps her up at night. Checking built code into a source control system? Was spending time building custom tooling around this really the best use of her time? Are special builds as part of the source even a good idea? Maybe she should publish a separate custom build tool just for her library, but really, does that help or is she just adding to the problem? She thinks “I hear iOS developers are in high demand” and softly cries herself to sleep.
Source Code vs Executable Code
In server JavaScript (like many dynamic languages) source code is runnable as-is: you author and consume in the same format. There is no need to publish builds. Getting the source for a version of a package is the same as getting a build[1].
If you haven’t done much browser JavaScript, you might assume the same holds true. But nearly all libraries feature build steps for publishing. From the simplest (wrapping files in an IIFE and concatenating) to the very complex (passing source through traceur-compiler, tree-shaking to remove unused modules, concatenating external dependencies, and finally minifying into a single package).
Browser JavaScript is a compiled language. Little code is authored exactly as it will eventually be executed.
Dependency Granularity
Dependencies come in two granularities: coarse and fine.
Coarse Grained
This is the “package”. Declaring coarse dependencies is all about getting files (and the right versions of those files) from the package management system onto a computer for development, build, or deployment. You’ll declare these in a manifest file (e.g. bower.json). Just having the coarse grained dependencies installed locally doesn’t bring them into your code’s runtime.
It should look like this:
cli-tool install lodash
Fine Grained
This is the module import process. Declaring fine grained dependencies is all about getting parts of coarse grained dependencies into the execution of your code. It might look like this:
var _ = require('lodash');
but is ideally much finer grained to make exact dependencies explicit and avoid bundling unused code:
import {clone, random} from "lodash";
Constraints and Trade-offs
YSlow is a great list of browser environment constraints to think about in development. For dependency management the two most important constraints are
- Minimize HTTP Request Size
- Minimize HTTP Requests
Frustratingly, these two constraints compete. For sufficiently trivial use cases you might think “just concat all JavaScript.” This works fine if you’re a front end developer or have a small full stack application. Large full stack application or almost any full frontal application benefits from being able to defer asset loading until the assets are needed. Loading assets for infrequently accessed parts of your application degrades the user’s experience of the site or application.
A great browser package manager needs to allow both patterns easily.
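To make the "defer until needed" half of that trade-off concrete, here is a minimal sketch. Everything in it is an assumption for illustration: lazy, fetchAdminBundle, and the requests counter are hypothetical names, and the "fetch" is a plain function standing in for a real HTTP request.

```javascript
// Wrap a loader so the underlying load runs at most once, on first use.
function lazy(load) {
  let pending = null;
  return function get() {
    if (pending === null) {
      // The executor runs synchronously, so load() fires on the first get().
      pending = new Promise(function (resolve) {
        resolve(load());
      });
    }
    return pending;
  };
}

// Hypothetical stand-in for requesting a rarely used bundle over HTTP.
let requests = 0;
function fetchAdminBundle() {
  requests += 1;
  return { render: function () { return "admin panel"; } };
}

// Nothing is requested here; the cost is paid by whoever first calls getAdmin().
const getAdmin = lazy(fetchAdminBundle);
```

Every caller of getAdmin() gets the same promise, so the rarely visited admin screen costs one request the first time someone opens it and zero requests for everyone who never does. The "just concat everything" approach is the opposite bet: one bigger request up front, nothing later.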
Modularity
Let’s start with the premise that modular code is good and that developing modularly results in beneficial emergent properties. If you don’t agree with that, package management probably isn’t a concern for you. If you think that modularity is good, let’s further agree you need a module format: syntax to define what code you need and a syntax to expose your public API.
Sadly, the browser has five dominant and conflicting authoring formats for modules, matched by five consumption formats for modules: Globals, CommonJS (CJS), Asynchronous Module Definition (AMD), Universal Module Definition (UMD), and ECMAScript 6 (ES6) modules[2].
This will earn me some strange looks (except from @searls), but Globals are the only universally available and consistently usable module format in the browser. You don’t get the clarity of expression that comes with a focused dependency syntax, or space saving features like tree-shaking but you can’t knock globals for the It Just Works™ success.
CJS has clearly won in the node world. Interestingly, it did not win through the node ethic of “let a thousand flowers bloom”: it’s shoved down your throat. This fact allowed node and npm to blossom. Some decisions, it turns out, are just too important to leave to the masses.
Outside its node success, CJS is maybe 2% of browser JavaScript, via browserify.
AMD is the most popular non-globals format for the browser, possibly waaaaaay up at 3% of browser JavaScript. Its semantics are hideous and you will feel bad while using it.
ES6 modules are purely an authoring format today. The spec was finalized July 2014 and they cannot, at this time, execute in any environment. Axel Rauschmayer wrote a good high level summary of how they compare to other formats. My hope is that ES6 modules become the only module format one day.
UMD was an interesting experiment to see if we could solve the author/consumer module format problem. Ultimately it resulted in yet another format and one that is egregiously ugly. Still, I consider UMD a success in the sense that it teased out an incredibly important fact we can apply to module interoperability: UMD works great when you have no dependencies in your lib.
This is so incredibly important, that I’ll highlight it a second time: UMD works great when you have no dependencies in your lib.
This statement is true for any module format. Good modularity is not composed at a depth of 1, so if you want both “many small packages” and It Just Works™ your packages must be in a single module format from your application code down to the deepest internal dependency. This is the key to why Globals dominate the module space: everyone publishes globals. Globals are universal.
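If you haven't stared at a UMD wrapper lately, here is a rough sketch of the idea for a library with zero dependencies. It is not the canonical UMD boilerplate: registerUniversal and the env objects are hypothetical, and I pass the environment in explicitly (instead of sniffing real define, module, and window values) so all three branches can run in one place.

```javascript
// UMD-style registration against an explicit environment object,
// so each branch can be exercised without a real loader or browser.
function registerUniversal(env, name, factory) {
  if (typeof env.define === "function" && env.define.amd) {
    env.define([], factory);                 // AMD loader present
  } else if (env.module && typeof env.module.exports === "object") {
    env.module.exports = factory();          // CommonJS-style environment
  } else {
    env.root[name] = factory();              // fall back to a global
  }
}

// A library with no dependencies: the factory takes no arguments.
function greeterFactory() {
  return { greet: function (who) { return "hello " + who; } };
}

// Three simulated environments (all hypothetical stand-ins).
const amdRegistry = [];
const amdDefine = function (deps, factory) { amdRegistry.push(factory()); };
amdDefine.amd = {};
const amdEnv = { define: amdDefine };
const cjsEnv = { module: { exports: {} } };
const globalEnv = { root: {} };

registerUniversal(amdEnv, "greeter", greeterFactory);
registerUniversal(cjsEnv, "greeter", greeterFactory);
registerUniversal(globalEnv, "greeter", greeterFactory);
```

This is pleasant precisely because the factory takes no arguments. The moment greeterFactory needs another library, each branch has to find that library its own way (an AMD dependency name, a require string, a global property), and the author has to guess what every consumer called it.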
Maybe some other format will win, although, honestly, it’s been five fucking long years already. The web can’t afford to wait another year or two or five. You deserve better. Today.
The Browser JavaScript Package Manager
Let’s distill the above blather into a few pithy statements about browser JavaScript that you can tweet and argue about:
- JavaScript has many communities of practice with hugely divergent needs.
- JavaScript has many module formats, none of which have “won”, except the worst of them: Globals.
- JavaScript source is always compiled to something else.
- Modularity is a good design pattern, especially fine-grained modularity.
- Module interop only Just Works™ at a depth of 1, but good modularity has depth > 1.
- The web has conflicting constraints around packaging, there is no “one true way” to resolve this.
- Authors and consumers shouldn’t need to know about each other’s processes.
- Handcrafted per-library build tools only add complexity to an already complex problem.
- Your preferred format has not “won”. Everyone thinks this. Everyone is wrong.
- Maybe one format will “win” in the next 1–5 years. Maybe. Probably we’re just fucked.
- We deserve a better experience. Today.
Given this, let’s sketch out key points for a package manager that works within these constraints to deliver real modularity to the browser today.
Authoring occurs in the author’s preferred module format
As a library author I declare my coarse dependencies in a manifest file once and never again.
For fine-grained dependencies I import using my preferred module notation. I expose my public API using my preferred module notation. I don’t add brittle glue code in the space between modules. I don’t add brittle glue code for N consumption formats.
Consuming occurs in the consumer’s preferred module format
When writing an application, I don’t need to know what module format my dependencies were authored in. I import them using my preferred module notation.
I’m happy to note my preference in my manifest.json file:
{
  "module": "cjs"
}
I don’t add brittle glue code in the space between modules or hand-craft a separate build manifest to make sure files are loaded in the correct order. I use my own preferred build process that matches the constraints of my project. If I’m a browserify fan, great. From my perspective everything in the registry is in CommonJS. If I’m betting on an ES6 future, awesome. Everything in the registry from my perspective is published as ES6 modules.
I never need to know about or deal with package formats. So, I never need to do:
cli-tool install packageA packageB packageC --module=es6
If you think this isn’t a MASSIVELY useful feature for both authors and consumers, try reading Github issues for almost any JavaScript project. Here’s a related issue I found at random in < 10 seconds of searching. Every author has to answer questions like this. All the time. Please, make this problem go away.
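Here is a small sketch of what the consumer side of this could look like inside such a tool. None of this is a real registry API: registryEntry, its builds map, the file names, and resolveBuild are all made up, and defaulting to globals when the manifest says nothing is my own assumption.

```javascript
// Hypothetical registry record: one hosted build per consumption format.
const registryEntry = {
  name: "packageA",
  version: "1.2.0",
  builds: {
    cjs: "packageA-1.2.0.cjs.js",
    amd: "packageA-1.2.0.amd.js",
    globals: "packageA-1.2.0.globals.js",
  },
};

// Pick the build matching the consumer's manifest preference.
// Fails loudly if the author skipped publishing that format.
function resolveBuild(manifest, entry) {
  const format = manifest.module || "globals"; // assumed default
  if (!Object.prototype.hasOwnProperty.call(entry.builds, format)) {
    throw new Error(
      entry.name + "@" + entry.version + " has no " + format + " build"
    );
  }
  return entry.builds[format];
}
```

With a manifest of { "module": "cjs" }, resolveBuild hands back the CJS file and the consumer never types a format flag. Asking for a format the author opted out of is an install-time error with a clear message, not a mystery at runtime.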
Conversion from authoring format to consumer format occurs on publish
To enable the above, publishing does not transmit authoring source code. It builds from source into multiple consumption formats. The transformation takes place through libraries in the cli-tool, so each library author doesn’t need to investigate and write their own transformation process.
If cli-tool can’t transform from one format to the other (e.g. circular dependencies that ES6 can handle, but CJS cannot or expressions in require statements that make resolution non-deterministic) warn the author and allow them to either fix or choose to skip publishing to a particular consumption format.
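As a rough illustration of the "expressions in require statements" check, here is a sketch of a publish-time lint. It is deliberately naive: findDynamicRequires is a hypothetical name, and it uses a regular expression where a real tool would parse the source, so comments and string contents can fool it.

```javascript
// Find require(...) calls whose argument is not a plain string literal.
// Those cannot be resolved statically, so conversion should warn on them.
function findDynamicRequires(source) {
  const calls = /\brequire\s*\(\s*([^)]*?)\s*\)/g;
  const literal = /^(['"])[^'"]+\1$/;
  const problems = [];
  let match;
  while ((match = calls.exec(source)) !== null) {
    if (!literal.test(match[1])) {
      problems.push(match[1]);
    }
  }
  return problems;
}

// Sample CommonJS source with one static and two dynamic requires.
const sample = [
  "var _ = require('lodash');",
  "var locale = require('./locale/' + lang);",
  "var plugin = require(pluginName);",
].join("\n");

const warnings = findDynamicRequires(sample);
// warnings holds the two offending arguments; the lodash require passes.
```

The point is who sees the warning. At publish time it lands in front of the author, who can either rewrite the two offending lines or skip that consumption format. At install time it would land in front of a consumer who can do nothing about it.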
Builds are hosted
For the above three to work, a package registry would need to host builds. You could invert the relationship and have cli-tool convert from one modular format to another on install. This would let you point the registry to a source version control system (as bower and duo have), but this puts transformation errors in front of consumers, which is not where they can be properly addressed.
Clearly delineate server, build, and client packages in manifest and directory structure to minimize confusion.
In my experience this format is very clear to users:
.
├── browser_modules
│   ├── mocha
│   ├── react
│   └── underscore
└── node_modules
    ├── express
    ├── mocha
    └── underscore
While this format makes them wonder what, exactly, ends up in their browser code, what is used in their server, and what is part of the build process:
.
└── node_modules
    ├── express
    ├── mocha
    ├── react
    └── underscore
Flatten the dependency tree, let consumers resolve version conflicts
To address the asset size constraints of the web platform, npm-style sub-dependencies don’t work. Package installation needs to be flat and the consumer has to resolve conflicts on her own. Bower got this right, and npm is planning to make this change for browser JavaScript as well.
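A minimal sketch of what "flat, and the consumer resolves conflicts" means in practice. This is not how Bower or npm implement it: flatten and the tree shape are hypothetical, and I compare exact version strings where a real tool would deal in semver ranges.

```javascript
// Walk a nested dependency tree and collect every version requested per package.
function flatten(tree) {
  const versions = new Map();
  (function visit(deps) {
    for (const dep of deps) {
      if (!versions.has(dep.name)) versions.set(dep.name, new Set());
      versions.get(dep.name).add(dep.version);
      visit(dep.dependencies || []);
    }
  })(tree);

  // One requested version: install it. More than one: hand it to the consumer.
  const installed = {};
  const conflicts = {};
  for (const [name, requested] of versions) {
    if (requested.size === 1) installed[name] = [...requested][0];
    else conflicts[name] = [...requested].sort(); // plain string sort, fine for display
  }
  return { installed, conflicts };
}

// Two packages that each want a different underscore.
const tree = [
  { name: "backbone", version: "1.1.2", dependencies: [
    { name: "underscore", version: "1.7.0" },
  ] },
  { name: "my-widget", version: "0.3.0", dependencies: [
    { name: "underscore", version: "1.6.0" },
    { name: "jquery", version: "2.1.1" },
  ] },
];

const result = flatten(tree);
// result.installed has backbone, my-widget, and jquery.
// result.conflicts has underscore with both requested versions.
```

A nested install would quietly ship both underscores to the browser. The flat install refuses to guess and makes the consumer pick one, which is annoying exactly once and saves bytes on every page load after that.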
Offer a web-based build tool/package bundler for “front end developers”
All the neat package management tricks above are awesome for the subset of browser JavaScript users who are happy with build tools and command line interfaces. They’re the minority. Library authors are publishing to your registry anyway, do them (and their users) a solid: make the packages available for download in a global-style format. Replace the need for each library to create this system on their own. As an author I can tell you, publishing bundles takes up a stupid amount of release time.
Pages like http://jquery.com/download/, http://emberjs.com/builds/#/release and every “Download” button on every marketing site for every library can just point to the registry. Library authors without the time to make packages available for easy consumption by “front end developers” can stop worrying and just publish.
“Front end developers” get a single, reliable source for packages. To really hit this out of the park, let them assemble several libraries on the site, build into a single package, and make minification an option. These are problems everyone needs to solve on their own. Make this problem go away.
The extras
While you’re at it solve CSS, image, and font dependency problems (each of which could use their own 5000 word exploration) and include a working lock file mechanism. I would also like a unicorn.
Abandon All Hope
Right now you’re probably thinking: this kind of behavior doesn’t belong in a package management system. And you’re absolutely right. I can’t think of any other package manager off the top of my head that concerns itself, in any way, with how fine-grained dependency resolution occurs or how a library exposes its public API.
This behavior belongs in the language proper. But that’s not the world we have.
After five years of community experimentation I’m pretty convinced this problem won’t go away until we all start using the same module format for both authoring and consuming[3] or some central broker makes publishing and consuming formats frictionless. Does this belong in a package manager? No. But it’s the best place we’ve got.
Woe unto we unlucky JavaScripters: No package manager or registry is really solving these problems yet. Bower gets the flat hierarchy right, but has no convention around project structure so everyone is writing their own shim code to get files from bower_components into their projects.
Duo and Bower both let you point to source control for packages, but source files aren’t builds so authors have to either create new source repos that are really just builds or include build files in their source. It’s a headache for authors and since the pattern isn’t consistent across all projects, it’s a headache for consumers.
Duo has a good transform-based interop story, but does this on the consumer side and mandates an authoring format (CJS).
Npm hosts packages, so as an author you can cobble together a system like this: programmatically use npm and publish multiple packages (lib-es6, lib-amd, lib-cjs, lib-globals, lib-globals-packaged), but unless all of your dependencies also do this you’re limited to a dependency depth of 1. Plus, now consumers need to learn which brand of your package they need. We could build a new-cli system on top of npm but in the best case we’ll pollute npm’s registry with thousands of mostly identical packages and earn the ire of the node community.
I’ve chatted briefly with some people at npm about these problems. They’re betting on “ecosystems” which just give official structure to the existing balkanization. I’m convinced they don’t really “get” the needs of the browser JavaScript community and just really, really really hope everyone starts using CJS.
Now, take up the pitchforks and flaming brands! Come tell me I’m an idiot.
Or, if this sounds like it could end your pain: go pester @izs and @seldo from npm, @MattMueller from Duo, and @satazor from Bower. Tell them to come up with some kind of plan (it doesn’t need to be this one) that is a bit more nuanced than “well, hopefully you kids work it out eventually!”
[1] Yes, yes. CoffeeScript people.
[2] Before you um actually me let’s get a common conversation out of the way:
Yes, your {INSERT PREFERRED MODULE FORMAT} “just works” for you. I’m very happy that you found a great workflow with {INSERT PREFERRED MODULE FORMAT} and that the tooling around it has matured to the point where you’re confident saying everyone should just get on board with {INSERT PREFERRED MODULE FORMAT}, stop using {INSERT OTHER MODULE FORMAT} (which clearly suffers from pretty foundational flaws, let’s be honest!), so we can all live in the happy future of a common module format. Yes, I did know that, in fact, {INSERT PREFERRED MODULE FORMAT} can compile down to {INSERT OTHER MODULE FORMAT}, {INSERT OTHER MODULE FORMAT}, _and_ {INSERT OTHER MODULE FORMAT} pretty easily. Yes, that’s really great for {INSERT OTHER MODULE FORMAT} users to have an excellent migration path to {INSERT PREFERRED MODULE FORMAT} or, heck, they can even keep using {INSERT OTHER MODULE FORMAT} if they want. Yeah, I also heard that {INSERT PREFERRED MODULE FORMAT} tooling author is a very smart {INSERT SEX/GENDER APPROPRIATE NOUN}, {INSERT SEX/GENDER APPROPRIATE PRONOUN} talk on modules at the last {INSERT RECENT JAVASCRIPT CONFERENCE NAME} was great. We all learned a lot about static analysis and how {INSERT PREFERRED MODULE FORMAT} is going to really solve a lot of problems. I’m feeling pretty convinced that you’re right and {INSERT PREFERRED MODULE FORMAT} has won.
[3] so: either a) in 2015 when we all collectively agree to move to ES6 for authoring which is the correct technical solution to this problem or b) never because ES6 is a technical solution that, like all other technical solutions, has to gain broad community support to Just Work™.