Ruby is a productive and fun environment to work in, with lots of useful tools and products for managing software quality. Today, I am excited to share one of my recent ideas in that space: Undercover, a gem that stops us developers from shipping untested code, both in CI and in development. You can find it at https://github.com/grodowski/undercover.
Undercover inspects changed files and creates warnings for blocks of code which have not been tested, similar to a linter. It’s a simple idea I found useful in a large codebase along with existing coverage reporting tools and I hope you’ll see the value as well!
Testing large apps
Dealing with large inherited (legacy?) codebases carries a risk of breakage. That’s why we write tests in the first place. Every programmer is probably guilty of a bug or service outage caused by that one tiny change, because it is so easy to miss a side effect or an edge case, especially when we are new to a project.
The test suite in place may be subpar. There is no guarantee that the assumptions it makes are correct. In the worst case, it might be incomplete or not exist at all.
I have worked with such projects and used to be an advocate of always aiming for full test coverage. However, reality quickly proved me wrong. Testing code created months or years ago by another person (who might have already left the company 😱) results in brittle tests that usually fail to capture the full complexity or all the edge cases. In other cases, it becomes an endless effort of reverse engineering production code that works just fine for its purpose.
So is 100% test coverage actually a goal worth hitting? No, it’s usually not! I’d like to propose a different approach to measuring and improving coverage for large apps, so that we can live peacefully with legacy code and spend more time working on meaningful features.
Write tests at the right moment
Undercover is a code review utility that takes the ideas from code quality tools like RuboCop or Pronto and applies them to coverage. If 100% test coverage is not what we want to invest our time in, how about making sure that just the code in scope of our changes is tested well?
Undercover does that and triggers warnings on untested parts of our code based on a changeset from git and a set of rules. Take a look at a sample output below, where undercover checks a recent change in undercover 🙀.
These warnings complement code coverage reported as a percentage, or as a percentage delta in the case of pull requests. Why? Firstly, because timing matters: Undercover’s goal is to turn code review comments like this…
…into automated feedback that should be addressed before the actual code review (involving a human) starts. It doesn’t care whether you write tests before or after the implementation, but it makes sure they are in place before the code is merged.
Additionally, it is less prone to false green results. A false green result is a message like “your code coverage has improved by 1%” on a change that still contains untested methods: adding enough new, well-tested code is sometimes all it takes to skew the percentage and sneak them in. This approach is also insensitive to code deletions, which may cause fluctuations in code coverage measured as a percentage.
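To make the false green effect concrete, here is a small arithmetic sketch. The line counts are made up for illustration and `coverage_percent` is a hypothetical helper, not part of any coverage tool.

```ruby
# Hypothetical helper: coverage as a percentage, rounded to two decimals.
def coverage_percent(covered_lines, total_lines)
  (100.0 * covered_lines / total_lines).round(2)
end

# Made-up baseline: 800 of 1000 relevant lines are covered.
before = coverage_percent(800, 1000)

# A pull request adds 100 fully tested lines plus a 20-line untested method.
after_addition = coverage_percent(800 + 100, 1000 + 120)

# A different pull request only deletes 100 covered lines and adds nothing.
after_deletion = coverage_percent(800 - 100, 1000 - 100)

puts "before:         #{before}%"
puts "after addition: #{after_addition}% (went up despite 20 untested lines)"
puts "after deletion: #{after_deletion}% (went down without any new untested code)"
```

With these numbers the percentage rises in the first case and drops in the second, even though only the first change introduced untested code.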
How it’s made
Undercover fuses data from a git repository (rugged), a coverage report (simplecov) and a parsed representation of your Ruby code.
Each line taken from a git diff may trigger a warning if it has been added or modified. Then, the selected candidate source lines are matched against the latest coverage report to check if they have been hit at least once in tests. Finally, warnings are generated for respective code blocks (methods, classes, modules and blocks) that encompass untested source lines.
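The matching steps described above can be sketched in a few lines of plain Ruby. This is a simplified illustration and not Undercover’s actual implementation: `CodeBlock`, `untested_blocks` and the hash-based coverage format are assumptions made for the example, and the changed lines are passed in directly instead of being read from a git diff.

```ruby
# Hypothetical representation of a method, class, module or block
# together with the source lines it spans.
CodeBlock = Struct.new(:name, :first_line, :last_line)

# changed_lines: line numbers added or modified in the diff
# line_hits:     line number => hit count (nil for non-executable lines)
# blocks:        CodeBlock entries parsed from the same file
def untested_blocks(changed_lines, line_hits, blocks)
  # Keep only changed lines that are executable but were never hit in tests.
  untested = changed_lines.select { |line| line_hits[line] == 0 }

  # Report every block that encloses at least one of those lines.
  blocks.select do |block|
    untested.any? { |line| line.between?(block.first_line, block.last_line) }
  end
end

blocks = [CodeBlock.new("Foo#bar", 2, 4), CodeBlock.new("Foo#baz", 6, 8)]
line_hits = { 2 => nil, 3 => 5, 4 => nil, 6 => nil, 7 => 0, 8 => nil }

untested_blocks([3, 7], line_hits, blocks).each do |block|
  puts "warning: #{block.name} (lines #{block.first_line}-#{block.last_line}) has untested changes"
end
```

Here both methods were touched, but only `Foo#baz` produces a warning, because its changed line was never executed by the test suite.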
There’s more to come 🏗
Undercover is still an experiment in its early days. However, we are using it in the CI/CD pipeline at Rainforest QA and it has already given some useful feedback on our pull requests! Now it’s time to explore where to take it next and these are some ideas I would like to work on.
Better feedback loop
Good workflow integration is a must-have and can be achieved through a Pronto plugin as well as a Code Climate engine. Both can provide users with faster feedback from CI and are on my list. The Pronto runner is already in the works!
Undercover warnings are currently derived from changed source lines. But what if the line below is still untested and is waiting to cause an error? Or even worse, what if the total coverage of the file I just modified is really poor? My idea for that is called “Run Modes”. They expand the range of candidate source lines to all adequate locations to minimise the error inherent in coverage measurements. Two new modes would report on changed code blocks or changed files instead of just source lines, hence fostering proactive coverage improvements for methods/blocks/files we touch in our changes.
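As a rough sketch of how such modes could widen the set of candidate lines, consider the function below. Run Modes are only an idea at this point, so the mode names (`:lines`, `:blocks`, `:files`) and the `candidate_lines` helper are hypothetical and chosen purely for illustration.

```ruby
# changed_lines:   line numbers touched by the diff
# block_ranges:    line ranges of the methods/blocks in the file, e.g. [2..4, 6..9]
# file_line_count: total number of lines in the file
def candidate_lines(mode, changed_lines, block_ranges, file_line_count)
  case mode
  when :lines
    # Current behaviour: only the changed lines themselves.
    changed_lines.sort
  when :blocks
    # Every line of each block that contains a changed line.
    touched = block_ranges.select { |range| changed_lines.any? { |line| range.cover?(line) } }
    (changed_lines + touched.flat_map(&:to_a)).uniq.sort
  when :files
    # Every line of the changed file.
    (1..file_line_count).to_a
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end

%i[lines blocks files].each do |mode|
  puts "#{mode}: #{candidate_lines(mode, [7], [2..4, 6..9], 12).inspect}"
end
```

The wider the mode, the more untested neighbours of a change get reported, at the cost of more warnings per pull request.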
Coverage data quality
Branch coverage is a new addition to Ruby available in 2.5+ or through the DeepCover gem, while the covered gem can provide coverage warnings for evaluated code like view templates. Incorporating them into the gem would help detect brittle tests and reduce false greens. You might also want to check out this Ruby core issue with an interesting discussion on where branch coverage is heading.
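To illustrate why branch data matters, here is a small sketch using Ruby’s built-in Coverage module (2.5+) with branch measurement enabled. The `coverage_for` helper and the sample `sign` method are made up for this example, and the sketch assumes no other coverage tool is already running in the same process.

```ruby
require "coverage"
require "tempfile"

# Hypothetical helper: loads the given source from a temporary file
# and returns the Coverage data recorded for that file.
def coverage_for(source)
  file = Tempfile.new(["sample", ".rb"])
  file.write(source)
  file.close

  Coverage.start(lines: true, branches: true)
  load file.path

  # Look the file up by basename in case the recorded path differs
  # from the path we passed to load.
  _path, data = Coverage.result.find do |path, _|
    File.basename(path) == File.basename(file.path)
  end
  data
ensure
  file&.unlink
end

# The ternary has two branches, but the sample only exercises one of them.
sample = <<~RUBY
  def sign(n)
    n >= 0 ? :non_negative : :negative
  end
  sign(1)
RUBY

data = coverage_for(sample)
all_lines_hit = data[:lines].compact.all?(&:positive?)
untaken_branches = data[:branches].values.flat_map(&:values).count(&:zero?)

puts "all executable lines hit: #{all_lines_hit}"
puts "branches never taken: #{untaken_branches}"
```

Line coverage alone would consider this file fully tested, while the branch data still shows a code path that no test has exercised.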
I’d love to hear your thoughts about Undercover’s approach to measuring code coverage. Does it fit into your current testing strategy? If you like the idea, go visit undercover on GitHub, press the star button ⭐️ and try it out for your Ruby project. Happy testing!