CC BY-SA 2.0 by MyAngelG.

Maybe you should stop using Grunt

For approximately 18 months, most of my time at work was spent writing grunt tasks, wiring them up, using them in other projects, building a meta-layer that glued everything together, and hacking the hell out of Grunt when I wanted to make it do something it wasn’t designed for. I have spoken at a local meetup about using Grunt (slides). But a few months ago, my team decided we no longer needed Grunt, and we haven’t looked back. Although Grunt is a reasonable solution in some cases, it was not a good fit for what we were trying to do. If you are trying to build a reusable SDK to share across projects, I recommend against using Grunt.

(If you are not trying to build an SDK, and you just want to set up your single project with a task workflow, then go ahead and use Grunt (or gulp). They will serve you just fine.)

My team builds an SDK for creating and deploying web apps at Opower. The developer-facing component of this SDK took the form of a set of commands you could run on the command line. Our desired workflow went something like this:

  • Run a yeoman generator to create a new project.
  • When npm install is finished, you have everything you need to get started.
  • Run commands like npm start to see your app running, or npm test to run the tests.
  • When you are ready to create something that can be deployed, push a commit to the master branch of the parent repo.

(Deployment and release management were handled separately.)

The part of this workflow I’d like to focus on is how we made commands like npm install, npm start, and npm test do something meaningful.

Enter Grunt

We initially gravitated towards grunt. There was a lot of excitement around it in the community. People and projects we respected were using it. We thought that it would provide a layer of abstraction for common command line tasks, like getting config, file IO, logging, etc. We also hoped that we could leverage the large library of existing grunt tasks. (No need to integrate with jshint when we could just drop in grunt-jshint.)

Grunt does do some things well. If you have a single project you’re working on, and you want to drop in pre-existing tasks to do common workflow steps like static analysis, running tests, and performing builds, then you’ll be in good shape. There’s plenty of documentation on how to do this, and Grunt is more manageable than a mess of shell scripts or a giant scripts entry in package.json. Grunt also provides a nice way to share config between tasks, so you can lint / test / build the same set of files, instead of having to duplicate knowledge of your app’s layout.

However, from a task writer’s perspective, it can be more challenging to work with. My team’s first Grunt tasks were full of anti-patterns, but then we refactored them to follow best practices. These measures greatly improved sanity on an individual task level, and we had a solid set of tasks that worked well. But our dream of having a good way to share them across projects was not yet fulfilled.

Meet the Taskmaster

Grunt is good for setting up a specific build chain for a single project. But the main thing we were interested in wasn’t setting up a single project; we wanted an SDK, where each consuming project would only need to add config in the places where it couldn’t follow conventions. We didn’t want to duplicate hundreds of lines of grunt config. We wanted something like this:

Different projects were of different types, like a widget or an api client library. Because the project would report its type, we could provide a consistent interface that always “did the right thing” for that project.

You would do a little config to tell us which of a pre-defined set of project types you had, and we would wire up our combination of third-party Grunt tasks and our custom tasks for building and running your app. Unfortunately, trying to use Grunt to realize this vision was truly painful.

Async Config

The inability to specify config asynchronously turned into a disaster. Our primary need for async config was specifying temp directories for tasks to operate in. We used tmp, which does not have a synchronous api. The general pattern here was that we would generate a tmp directory and say, “This is the output of Task A and the input of Task B”. Instead of having to wire that up ourselves, I would like the task system to provide an abstraction for pipelining tasks. (Perhaps gulp helps with this?)

Our solution was to create a new task: set-async-config. It would take input that looked like this:

getTmpDir returns a promise.
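A sketch of what that input might look like, assuming each target's options map a Grunt config path to a promise. The target name build-something comes from the text; the config paths and this particular getTmpDir implementation are hypothetical stand-ins (the real one was built on tmp).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for getTmpDir: creates a fresh temp directory and
// returns a promise for its path.
function getTmpDir(prefix) {
  return fs.promises.mkdtemp(path.join(os.tmpdir(), `${prefix}-`));
}

// One shared directory: "the output of Task A and the input of Task B".
const bundleDir = getTmpDir('bundle');

// Assumed shape: each option key is a config path, each value a promise for
// the value that should end up at that path. The paths are made up.
const setAsyncConfig = {
  'build-something': {
    options: {
      'browserify.build.dest': bundleDir, // Task A writes here
      'uglify.build.src': bundleDir,      // Task B reads from here
    },
  },
};
```

Both paths share one promise, so both tasks end up pointing at the same directory once it resolves.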

When you ran grunt set-async-config:build-something, it would iterate over the entries in its options object, and for each key, it would essentially do this:

Some code omitted for brevity.

We iterate through the object, and for each key, we set that path in the grunt config to be the resolved value of the corresponding promise.
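The loop described above can be sketched roughly as follows. This is a reconstruction under assumptions, not our actual task: resolveAsyncConfig and the in-memory store are hypothetical names, and setConfig stands in for grunt.config.set so the snippet runs without Grunt.

```javascript
// Resolve every promise in `options`, then store each resolved value at its
// config path via `setConfig` (in a Grunt task this would be grunt.config.set).
function resolveAsyncConfig(options, setConfig) {
  const configPaths = Object.keys(options);
  return Promise.all(configPaths.map((p) => options[p])).then((values) => {
    configPaths.forEach((p, i) => setConfig(p, values[i]));
  });
}

// Usage with an in-memory stand-in for the Grunt config store.
const store = {};
const finished = resolveAsyncConfig(
  { 'browserify.build.dest': Promise.resolve('/tmp/bundle-abc123') },
  (configPath, value) => { store[configPath] = value; }
);
finished.then(() => console.log(store));
```

Inside a real Grunt task, you would also call this.async() and invoke the returned callback once the promise settles, so Grunt waits before moving on to the next task.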

This does allow you to have async config in grunt, but at a terrible cost. You can no longer just run tasks — you now need to know about certain other set-async-config tasks that need to be run first. And just looking at the config you’ve written in your js files is no longer the full story, because you don’t know how your other tasks will be modifying it at runtime. (We made a task called config that would print out the config for a certain task. This task should never need to exist.)

Our config for set-async-config was over 130 lines long. An example usage of multiple set-async-config calls in our unit test task definition:
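For illustration only, a unit test alias task with several set-async-config calls interleaved might look roughly like this. grunt.registerTask with a list of task names is Grunt's alias-task API; every task and target name below is made up, and registerUnitTask is a hypothetical wrapper.

```javascript
// Gruntfile-style function: receives the grunt object and registers an alias.
function registerUnitTask(grunt) {
  grunt.registerTask('unit', [
    'set-async-config:unit-bundle',   // has to run before browserify:unit
    'browserify:unit',
    'set-async-config:unit-coverage', // has to run before karma:unit
    'karma:unit',
    'clean:unit-tmp',
  ]);
}
```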

Ugh.

Lazy Loading

An early issue we ran into was the lack of lazy loading. Grunt doesn’t have a way to say, “figure out what tasks I need to run, and then load only them”. If Grunt is designed primarily for a small project with a few tasks, then this is a good simplification. But we had many tasks, and some of them were slow to initialize.

In the above code sample, everything outside myTaskFn gets executed on every grunt invocation. If you have require statements that are slow, you take a hit on every run. We solved this by moving all our requires into the task function itself, but it would be nice to not have this potential stumbling block. Additionally, we don’t want to spend the time to fix this issue in every third party library we use.
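A minimal sketch of that workaround, assuming a Gruntfile-style function that receives the grunt object. The compress task, the gruntfile function, and loadSlowDependency are hypothetical; zlib is only a stand-in for a module that is slow to load.

```javascript
// Counts how often the "slow" dependency is loaded.
let slowLoads = 0;
function loadSlowDependency() {
  slowLoads += 1;
  return require('zlib'); // imagine a module that takes seconds to load
}

// Grunt calls a function like this on every invocation, whatever task runs.
function gruntfile(grunt) {
  // Eager (the stumbling block): a require at this level is paid for even by
  // `grunt some-other-task`.
  // const zlib = loadSlowDependency();

  grunt.registerTask('compress', 'Gzip a string', function myTaskFn() {
    // Lazy: the cost is paid only when this task actually runs.
    const zlib = loadSlowDependency();
    return zlib.gzipSync('hello').length > 0;
  });
}
```

The catch is that this only covers code you control. A third-party task that requires its dependencies at the top of its file still pays the cost up front.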

I spent some time trying to hack around this by building a lazy-grunt-loading module, but I don’t recommend that anyone use it haha.

No global installation

This point is potentially controversial. Node.js has a convention that when you run npm install, you have everything you need. We took that to an extreme: when you were developing a widget, your node_modules directory would be 475+ MB. Add in post-install building steps from node-sass and others, and the install would take 10 minutes on a MBP with a fast connection, 20 minutes on Jenkins, and an hour for our users in Argentina. (We did a brief investigation and attributed the Argentine problems to a lack of SSD machines and geographic distance from our private npm registry in Virginia.)

The massive size of our dependency tree also exposed all sorts of race conditions and edge cases in npm, which we spent a ton of effort tracking down and reporting to them. They were very nice in addressing our bug reports!

This performance was unacceptable. I really love the idea of npm install giving you everything you need, but this was akin to installing Xcode anew in each iOS project you worked on. We wanted to investigate an approach where dev tools are installed globally. Although this has drawbacks of its own, we thought in our case it would be better. However, with Grunt, this isn’t really an option. The grunt-cli has to be installed globally, but there’s no way to install our task set globally. We hacked around it via npm link, but that is a little more complicated and error prone than we’d like. (A full discussion of the issues we ran into with that approach would be a digression.)

No task encapsulation

Often, the tasks the end-user would run would be composed of many smaller tasks. To bring up the example of our unit test task again:

We want users to run grunt unit. We don’t want users to run any of those sub-tasks. Grunt doesn’t provide a good way for us to express this. Much confusion and frustration was caused by people trying to save some time and only run a few of the sub-tasks without fully understanding what was going on and thus missing that special one thing that has to happen first.

The notion of a function that you refer to with a string is an interesting way to compose a pipeline of operations. But that’s a different problem than “what do I want my end-users to be able to interact with?”.

Related to this is Grunt’s behavior when you run a multi-task without a target specified (e.g. grunt browserify instead of grunt browserify:test). Instead of letting you specify a default target to run in this case, or just throwing an error, Grunt will run all targets of the multi-task. This is almost never what we wanted, and if it’s a task like clean, it can be very frustrating to accidentally delete something. The fact that we have no task encapsulation makes us far more vulnerable to this problem. If we’re using the clean task as a “sub task” of the tasks our users are supposed to run, then we’ve exposed this way for our users to shoot themselves in the foot.

Difficulty overriding look and feel

We wanted things to look a certain way. We had a header that would print out at the beginning of the entire run, and we wanted task headers to look a certain way, etc. Grunt does not provide support for this aside from inviting you to use hooker. I would rather use a tool that allows me to override the defaults in a sane way, rather than monkey patching. For instance, it would be nice if when initializing Grunt, you could pass in functions that would format various types of output.

No way to catch all errors

We tried to integrate NewRelic exception tracking, but it was so hard to find all the different places to listen for errors that we gave up.

Where are we now?

We dealt with some of these problems by hacking and monkey patching grunt. For others, we opened issues with the project maintainers. We heard great things about v0.5, but that’s been vaporware thus far. Unfortunately, development seems to have stalled on grunt entirely:

Screenshot from https://github.com/gruntjs/grunt/graphs/contributors

The most recent release of Grunt, v0.4.5, came out on May 12, 2014.

(To be clear, none of this should be taken as a personal attack against the maintainers of Grunt. They provide value to the community for free on their own time, and no one deserves anything from them.)

We didn’t see Grunt going anywhere, and we didn’t see the foundation it presented as compelling enough to justify putting a lot of effort of our own into building on top of it.

We were wary of getting super excited about re-inventing the wheel, just to end up with a half-baked Grunt or gulp clone that didn’t have the community support behind it. However, we realized that Grunt and gulp just aren’t designed for what we are trying to do. (I admit I am not terribly familiar with gulp; I would like to research it and do a follow-up post about it. Feel free to let me know if gulp will work well for this use case!) They optimize for a single project that pulls in some tasks, not for building an entire SDK that you can share between projects. It’s obviously totally fine that that’s out of their scope, but it explains why we’ve had so much pain trying to contort their code to solve our problems.

For v2 of our SDK, we’ve just been developing a set of cli apps from scratch. Instead of using grunt’s framework, we’ve been using modules like nconf and bunyan. I am happier with nconf than I have ever been with grunt config. Having the ability to take config from multiple sources with a clearly defined precedence is very nice. bunyan is also great. grunt.log is a pretty coarse tool; outputting logs as json with a fine-grained set of log levels and the ability to attach arbitrary metadata is powerful. (This also solves the problem we had trying to integrate NewRelic with Grunt error handling: we can be sure that all errors are logged, and then just listen to the logs.)
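To make those two ideas concrete, here is a dependency-free sketch. It deliberately does not use nconf's or bunyan's actual APIs; createConfig, createLogger, the level numbers, and all the values are hypothetical stand-ins for layered config with a fixed precedence and for JSON logs that a single listener can watch for errors.

```javascript
// Layered config: sources are checked in order, so earlier ones win.
function createConfig(...sources) {
  return {
    get(key) {
      const source = sources.find((s) =>
        Object.prototype.hasOwnProperty.call(s, key));
      return source ? source[key] : undefined;
    },
  };
}

// JSON-lines logger with levels and arbitrary metadata. Every record goes
// through `sink`, so one listener sees every error.
const LEVELS = { debug: 20, info: 30, warn: 40, error: 50 };
function createLogger(name, sink) {
  const logger = {};
  Object.keys(LEVELS).forEach((level) => {
    logger[level] = (fields, msg) =>
      sink(JSON.stringify({ name, level: LEVELS[level], msg, ...fields }));
  });
  return logger;
}

// Usage: command-line values beat environment values, which beat defaults.
const config = createConfig(
  { port: 9000 },                             // e.g. parsed from argv
  { port: 8080, env: 'ci' },                  // e.g. read from the environment
  { port: 3000, env: 'dev', verbose: false }  // defaults
);

const errors = [];
const log = createLogger('sdk', (line) => {
  // An error tracker could subscribe here instead of hooking each call site.
  if (JSON.parse(line).level >= LEVELS.error) errors.push(line);
});

log.info({ port: config.get('port') }, 'starting');
log.error({ task: 'build' }, 'bundle failed');
```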

Additionally, we’ve been telling our users to install our cli apps globally where it makes sense. Install times are way down, and we’re not dealing with crazy npm issues any more.

Now that we’re not using Grunt, entire classes of problems, like lazy loading tasks and hiding tasks from our end users, are no longer present. Obviously, our new approach brings new classes of problems, but they are tractable in a way that the Grunt problems weren’t. With lazy loading, I could only work around it with a hideous hack. With our current problems, there is good, maintainable code we can write to provide solutions. Moreover, there’s now one fewer abstraction layer in our system soaking up developer mental bandwidth and hiding bugs.

If you are trying to build a reusable SDK to share across projects, I recommend against using Grunt. It may give you a few quick wins, but in the long term it will introduce enough systemic issues that you’ll spend more time hacking around it than you would setting up equivalent functionality from various other npm modules. A set of small, focused modules that you picked to address your needs turns out to be better than a monolith. Who knew?

CC BY-SA 3.0 by D. Gordon E. Robertson

Thanks to @dylang for reading drafts of this.