RuboCop in legacy projects, part 2: Focus on the present
Applying your style guide selectively is the way to stay sane, but takes a little work
TL;DR Relax about the state of your old code (for now); work out how to make your new code (including changes) conform to your style first.
In my previous blog post in this series, I took a look at setting up RuboCop for use in a long-term legacy Rails project, including how to best utilise the ability to autogenerate a TODO file.
In short, contrary to what the
--auto-gen-config option does for us, we shouldn’t inherit from the generated
.rubocop_todo.yml file. That leaves our main configuration file,
.rubocop.yml a lot cleaner. To repeat from that last post:
.rubocop.ymlis a definitive statement of how we want all our future code — whether brand new, or rewrites of older code — to be written.
But if we have a clean, definitive config file, running
rubocop on the command line will most likely throw out a vast number of violations to be addressed.
The sheer quantity of those can feel overwhelming. I think that’s possibly the major emotional reason why people like the ‘traditional’ way of using
.rubocop_todo.yml to get to an ‘all green’ state, to falsely assuage the guilt of seeing a huge list of things to change. “Never mind,” we tell ourselves, “we’ll put all those changes into this extra file, and we can hack away at them as we go.”
Except we rarely do. When stuff gets swept under the carpet, we have better things to do with our time than to life the carpet back up and sweep everything out again.
Back in the last blog post, I suggested not using
.rubocop_todo.yml in this way, and keeping our main config file as:
a definitive statement of how we want all our future code — whether brand new, or rewrites of older code — to be written.
But that means that running
rubocop from the command line, whether on the whole suite or a single file, is likely to keep overwhelming us with violations from old code.
The key is working out how to filter out all that noise.
Be at peace with your old code base
The most important thing about improving your legacy code base is to not fret about it.
All being well, you’re working with a project which is in production, delivering value to customers ‘in the wild’. That is the most important thing.
Of course, having a codebase which is easy to read makes it easy to change. That is going to make it quicker to introduce new features, which increases the value to customers. That’s where having clean code that is easy to navigate helps, and RuboCop can help with that.
But really, if you’ve got code which is running and working, you shouldn’t feel guilty about leaving it alone.
Instead, we should focus our energy on the changes we make to our code, and making sure that those changes are written in accordance with our desired style.
To that end, you should find a means to limit reporting of violations to the lines of code you’re changing.
Now, how exactly you do that will depend on your own workflow. You could do this manually, by only running
rubocop on your files as you change them, and mentally ignoring violation notices from lines you haven’t touched.
But there are several utilities that can help cut out any noise that those extra violations would create, allowing you to focus on writing clean code.
Pronto is a general purpose library that allows various code analysis tools to run against changes in a git branch. Originally designed to be used against GitHub pull requests, you can also run it locally.
Pronto comes with a number of ‘runners’ — extensions which will run various scenarios against only your changed lines. In most cases all you need to do for each runner is install it, and Pronto will pick it up. For RuboCop, that means installing the
$ gem install pronto pronto-rubocop
However, you do need to make sure that
pronto-rubocop will be able to find the correct version of
rubocop, and any extensions (e.g.,
rubocop-rspec). You can add pronto as part of your Gemfile (in which case you can take the opportunity to be specific about versioning):
# Gemfilegem 'pronto', '~> 0.10.0', require: false
gem 'pronto-rubocop', '~> 0.10.0', require: false
Once it’s installed, you can check your new code before it’s staged for committing:
$ [bundle exec] pronto run --unstaged
Or, once it’s staged, you can check before committing:
$ [bundle exec] pronto run --staged
You can also compare to a previous commit, which is useful if you’re checking the state of a whole branch before raising a pull request.
The RuboCop output pronto emits is a little different to what you may be used to. For example:
app/models/catalogue_upload.rb:21 W: Style/HashSyntax: Use the new Ruby 1.9 hash syntax.
This is very similar to RuboCop’s own ‘emacs’ formatter (available with
rubocop -fe). It’s possible that some editor plugins could use this to help navigate to code violations, or indicate them inline — that’s not something I’ve implemented myself.
Pronto’s biggest strength is its versatility, especially as the basis of a CI-based checking tool. The README is full of integration examples and links to other articles that provide many examples of the uses to which it has been put.
Another tool — and one I’ve come to rely upon day to day — is Overcommit.
Overcommit is designed to allow you to configure git hooks from a single configuration file. Checks such as RuboCop can be set up to run as pre-commit tasks — and if it or any other pre-commit tasks fail, overcommit will halt so that you can correct the failures and try again.
Overcommit installs as a gem, which then provides a per-repo installation routine:
$ gem install overcommit
$ overcommit --install
The installation process creates a file in your project root called
.overcommit.yml, which initially contains nothing but comments, which you can replace with your own configuration.
Like RuboCop, Overcommit has its own default configuration, so you only need to add your own changes to
.overcommit.yml. For example:
command: ['bundle', 'exec', 'rubocop']
(Alternatively, if you’re already using
pronto as described above, you can configure Overcommit to run that task in your pre-commit hook).
By default, overcommit’s RuboCop support will warn you about errors in lines you haven’t touched. In keeping with our mantra of being at peace with our old code base, I turn that option off with the
problem_on_unmodified_line: ignore directive.
Occasionally, you may want to commit a file that doesn’t pass all the checks you have set up. This happens to me most often if I’m in the middle of something and want to check in a work-in-progress commit that I’m going to come back to. In that case, you can skip individual tests on a case by case basis:
$ SKIP=RuboCop git commit -m "WIP: Halfway through fixing user auth"
IN the couple of years I’ve been using overcommit, I’ve really begun to appreciate its power. Its selective application of tests is great — I can have a pre-commit hook that runs
yarn audit whenever
yarn.lock files change, for example, and Overcommit will just skip that check if the files haven’t changed.
There are a couple of things to watch when using Overcommit. It works its selective file magic at pre-commit stage by utilising the
git stash stack. Most of the time this won’t matter to you, but if you Ctrl-C out of its process at an inopportune moment, or a custom hook crashes for some reason, it may not restore your staged files correctly. If that happens,
git stash pop will usually fix it.
The other thing to bear in mind is that overcommit works best when you work with git via the command line. Because of its (occasionally verbose) output and the possibility that it may abort your attempts to commit, if you use your IDE or another GUI tool to work with Git, that might cause issues for you.
The previous two tools work by running RuboCop across entire files, and then discarding results which aren’t relevant to the files you’re actually working on. That’s because a single file of Ruby code is the smallest unit RuboCop can actually work with.
But what if RuboCop itself could work line by line?
That’s what the extension RubocopLineup tries to do. This effectively moves that discard operation into RuboCop itself, so that the core RuboCop code no longer emits warnings about any lines other than those that have changed.
It does this by monkey patching — swapping out an existing method of Ruby code within RuboCop’s internals for a different one. That’s perfectly valid Ruby — indeed, that ability is what appeals about the language to many a developer. There is always a risk, though, that future versions of RuboCop may change their internals in such a way that the code breaks Indeed, Lineup is inspired by an older Gist which attempted the same thing with an older version of RuboCop, but which will no longer work.
It’s also fairly limited in the comparisons it can make, comparing your code only to
master. For me, that was a deal-breaker: the main project I’ve been working on for the past year has a
development branch against which all new work is taken, so comparing with
master wouldn’t be an accurate portrayal of my code changes.
One way that Lineup might help is its ability to optionally include the plugin by calling
rubocop -r rubocop_lineup. That could potentially allow you to tweak RuboCop integration with an IDE so that it only flags errors as you change lines.
The first step in improving the code quality of an active, legacy code base is to focus on the new code we are writing. Tools like pronto, overcommit and rubocop_lineup can help us do that by filtering out violations on old code.
But just because we are now at peace with our old code base, that doesn’t mean we want to leave it alone for ever. Next time, I’ll look at how to manage larger scale cleanups sensibly.