RuboCop in legacy projects, part 1: TODOs and TODON’Ts

Scott Matthewman
Oct 2 · 6 min read

Ruby and Rails developers: adding style linters to your legacy projects is harder than it should be. Let’s do something about that.

This is the first of a series of posts I’m writing about implementing RuboCop in legacy Ruby and Rails projects, and which I am building into a wider talk to be given at a later date.

I’ve talked before about how having a consistent style guide is essential for any team of developers, and how one can set up RuboCop to apply that style guide across multiple projects managed by the same team.

But what if you have a project which has been worked on for a number of years without any style guide at all? In a typical Rails project, that could mean several thousand lines of code, each with radically different stylistic approaches depending upon which developer was working on it at the time.

RuboCop has some tools which can help you. But if you apply them without knowing what you are doing, you might make things worse for yourself.

The current way of thinking is wrong

There is advice out there for starting with RuboCop. The gem’s command-line executable has options to help. It’s advice I’ve used myself, and even amplified by sharing, including in my 2016 piece.

But, after years working with Ruby and Rails projects, I’ve come to the conclusion that the advice out there is wrong.

There are ways of thinking about our relationship with RuboCop that have long term benefits to our code, to the quality of our git histories, and to our growth as developers. The trouble is that the guidance that’s currently out there risks setting us down the wrong path.

Take this recent post from Prathamesh Sonpatki’s excellent blog as just example of many. It looks at how the team at Memory.ai added RuboCop into their workflow, and is an articulate, clearly laid out look at the approach that the Ruby community has taken for granted over the years:

For existing projects, there is a better way to integrate RuboCop. RuboCop provides a file .rubocop_todo.yml which records all the offenses from our codebase. We can fix these offenses one by one as we touch that particular piece of code. So we don't have to fix everything to start with.

RuboCop’s TODO file is generated by the command line option auto-gen-config:

rubocop --auto-gen-config

This action runs rubocop across your entire code base, then constructs a set of rule exceptions that effectively ‘ignore’ code that doesn’t fit into the existing guidelines. This file is then included at the top of your core rules file, .rubocop.yml.

The next time you run rubocop, you’ll get a green output. And ta-da! All is well. You can continue coding, and when time allows you can look at the contents of .rubocop_todo.yml and decide how you want to address each set of violations.

The problem with the TODO approach

The problem I have with this approach is that it mistakes ‘getting a row of green dots when running RuboCop’ with ‘having a healthy coding style’.

And my issues stem from the inclusion of .rubocop_todo.yml at the top of .rubocop.yml, the file which is supposed to be your definitive list of rules.

It’s one thing to generate a report of code that doesn’t meet the standards that RuboCop expects. That usage of the TODO file is great, and can be helpful. But by including it within your main rules file, you are effectively changing what you consider to be good coding practice within your codebase.

You get that nice, green output because RuboCop selectively turns off cops for some files, and adjusts metrics that affect the entire codebase.

For most cops, the generated TODO file generates a file-by-file list of exemptions. For example, from a real world example:

# Offense count: 6
# Cop supports --auto-correct.
Style/EachWithObject:
Exclude:
- 'app/forms/preferences_form.rb'
- 'app/models/concerns/form_extensions/serializer.rb'
- 'app/presenters/season_offers_view.rb'
- 'app/services/docusign_webhook_service.rb'
- 'app/services/negotiations_builder.rb'
- 'app/services/offer_creation_service.rb'

In this example, RuboCop found six files where inject or reduce were used in a place where each_with_object could be more efficient and/or make more readable code. By including the TODO file in our main .rubocop.yml file, we are saying “use each_with_object where possible, except in these six files.”

Maybe you think that’s OK. Certainly, I’d agree that it’s not particularly damaging to the rest of the application. You could (and should) argue, though, that anybody who needed to edit any of those six files ought to have their code subject to the same code rules as any other file in the application.

More egregious are those cops which are disabled because too many files violated them:

# Offense count: 373
# Cop supports --auto-correct.
# Configuration parameters: EnforcedStyle, IndentationWidth.
# SupportedStyles: with_first_argument, with_fixed_indentation
Layout/AlignArguments:
Enabled: false

If you import the TODO file blindly, then no file, now or in the future, will get checked against this cop at all.

Similarly, RuboCop’s metrics measurements will be adjusted upwards so that every single line, method, class and module passes:

# Offense count: 30
# Configuration parameters: CountComments.
Metrics/ClassLength:
Max: 535

# Offense count: 16
Metrics/CyclomaticComplexity:
Max: 17

# Offense count: 816
# Cop supports --auto-correct.
# Configuration parameters: AutoCorrect, AllowHeredoc...
# URISchemes: http, https
Metrics/LineLength:
Max: 1051

# Offense count: 60
# Configuration parameters: CountComments, ExcludedMethods.
Metrics/MethodLength:
Max: 533

# Offense count: 1
# Configuration parameters: CountComments.
Metrics/ModuleLength:
Max: 119

Again, this snippet is generated from a real world, production codebase. And yes, somewhere in that codebase is a line which is 1051 characters long.

And if that TODO file were to get checked in as part of your RuboCop configuration, how would you be able to set yourself any practical limits on new code that you write?

That’s the problem I have with the whole “run --auto-gen-config to get a green pass” approach to RuboCop. It’s effectively saying, “let’s write our styleguide to match the mess we have right now.” Which risks allowing us to get away with writing more mess in the future.

Instead, I want us to think of .rubocop.yml as being a definitive statement of how we want all our future code — whether brand new, or rewrites of older code — to be written.

And that means never, ever including .rubocop_todo.yml.

Sure, we won’t have a nice green output when we run .rubocop.yml. Instead, we’ll have an honest appraisal of where we are with our code.

Knowing how much of our code doesn’t meet the standards we want is more important than telling ourselves everything’s okay.

That doesn’t mean the --auto-gen-config command line argument doesn’t have merit, but it does need a little more careful handling.

A saner TODO file

Having a TODO file of code that violates our clearly defined set of rules is a definite boon when it comes to looking at where, and how, to address the discrepancy between the code style we have and the code style we want.

Running rubocop --auto-gen-config with a couple of other command line options can provide consistency to the report. And, assuming that our codebase is in a git repo, we can also prevent it being mixed in to our genuine set of rules:

$ rubocop --auto-gen-config \
--auto-gen-only-exclude \
--exclude-limit 2000 \
&& git restore .rubocop.yml

In the command line above, --exclude-limit determines how many files must violate a particular cop before it gets turned off. By choosing an arbitrary number larger than the total number of files being examined, we’ll always get a list of files that need attention rather than just disabling a cop.

Similarly, --auto-gen-only-exclude is designed to prevent those pesky Metrics cops from defining Max: values that apply to your whole code base. Instead, their output will match the other cops’, listing those files which don’t meet either RuboCop’s defaults or any values you’ve set yourself in your master .rubocop.yml file.

Finally, git restore (introduced in git v2.23; use git checkout -- .rubocop.yml if you’re still using an earlier version) undoes the insertion of the TODO file into your main rubocop file.

This means you can still have your list of files and cops that need to be reconciled with each other. But now, any new file you add will be guaranteed to be checked against a clean, clear, agreed set of rules.

Don’t forget the old stuff

Of course, there is still the matter of what to do with existing files. When should we go in and tidy them up? Just as importantly, when shouldn’t we? And what if we need to edit an existing file — should we be expected to make the whole thing comply with our new rules, or just the lines we’re modifying?

Questions like those are what I intend to cover in my next blog post on this topic.

Scott Matthewman

Written by

Scott is a software developer during the day and a theatre critic & director of an evening. Which is the worst superhero identity ever.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade