What makes a good unit test?

Mauricio Chirino
bitso.engineering
Published in bitso.engineering · 4 min read · Aug 3, 2022
Looking for quality over quantity

“When a measure becomes a target, it ceases to be a good measure.” — Goodhart’s Law

Let’s skip the part where I need to convince you that writing unit tests is good for your health. The more unit tests, the greater the confidence to ship code faster. However, when is it too much? When do tests stop adding value?

Unit tests are a funny thing when we try to measure their effectiveness. Let me explain: no coverage in a project is a red flag. The opposite isn’t true though: 100% code coverage does NOT guarantee proper software functioning.

How do we classify good vs bad tests then?

Since 100% coverage does not prove a bug-free system, we should strive to write useful tests instead. Easier said than done, right? Let’s classify them:

  • Hard to implement, low value: these are typically the tests we start writing when dealing with a legacy code base. They are hard to implement, usually due to the number of dependencies needed to bootstrap the system under test (SUT).

It’s common to see tests like the above whenever we’re dealing with legacy code. There’s too much setup involved just to get the SUT initialized, which increases fragility; any unrelated change that alters one of those interfaces will break the test suite without actually catching a bug. In short: it’s not refactoring-resistant 👎🏽
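A minimal sketch of what such a brittle test tends to look like (all names here are hypothetical stand-ins for a real legacy SUT):

```swift
// Hypothetical legacy SUT: three collaborators must be stubbed
// before a single behavior can be exercised.
protocol APIClient { func fetchItems() -> [Int] }
protocol Storage { func cachedItems() -> [Int] }
protocol Analytics { func track(_ event: String) }

final class LegacyCheckoutService {
    private let api: APIClient
    private let storage: Storage
    private let analytics: Analytics

    init(api: APIClient, storage: Storage, analytics: Analytics) {
        self.api = api
        self.storage = storage
        self.analytics = analytics
    }

    func total() -> Int {
        (api.fetchItems() + storage.cachedItems()).reduce(0, +)
    }
}

// Test doubles whose only purpose is satisfying the initializer.
struct StubAPI: APIClient { func fetchItems() -> [Int] { [] } }
struct StubStorage: Storage { func cachedItems() -> [Int] { [] } }
struct StubAnalytics: Analytics { func track(_ event: String) {} }

// The actual check is one line; everything above is ceremony.
// Change any initializer signature and this breaks without catching a bug.
let sut = LegacyCheckoutService(api: StubAPI(),
                                storage: StubStorage(),
                                analytics: StubAnalytics())
assert(sut.total() == 0)
```

Notice the ratio: one assertion, four supporting declarations that exist only so the SUT can be constructed.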

  • Easy to implement, low value: these are the tests that increase the coverage % without actually improving the bug-catching net. Examples of this are tests written to cover impossible states within a codebase; a cleaner architecture would render them obsolete (e.g. getting rid of optional values when they are actually mandatory for a given part of the system to work). The example below illustrates another kind:
Poorly used snapshot tests

Snapshots should be used to assert layout/design correctness, not business-rule state changes. They may increase the coverage % a whole lot without actually improving the signal. Imagine for instance that DNIVerificationViewController changes its layout (say, the input area grows to better show the numbers typed); in such a scenario we’d have 8 failing asserts that have nothing to do with our latest change. Poor signal:noise ratio.
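The “impossible state” case mentioned above can be sketched like this (hypothetical names): a test covering a nil branch that production can never reach only inflates coverage, while modeling the field as non-optional deletes both the dead branch and the test that guarded it.

```swift
// Before: an optional invites a test for a state that can't actually occur.
struct LooseUser { let name: String? }

func greeting(for user: LooseUser) -> String {
    // In practice this branch is unreachable: every caller always sets a name.
    guard let name = user.name else { return "Hello, stranger" }
    return "Hello, \(name)"
}

// After: the type system makes the impossible state unrepresentable,
// so no test for the nil case is needed in the first place.
struct User { let name: String }

func greeting(for user: User) -> String { "Hello, \(user.name)" }

assert(greeting(for: User(name: "Ada")) == "Hello, Ada")
```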

  • Hard to implement, high value: integration tests are the textbook example for this category. Even though they demand a high dose of brain power to create, the signal:noise ratio tends to favor them.
Integration test

Even though there’s a lot of setup involved here, the surface area these kinds of tests cover is huge. It is harder to introduce regressions when several of these are scattered as failsafes across our test suite, since their purpose is to guarantee proper interaction among multiple actors.
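A rough sketch of the idea (names are mine, not from the article): real collaborators are wired together and only the external boundary, the network here, is faked, so one test exercises the interaction among all the actors at once.

```swift
// Only the network boundary is faked; everything else is the real thing.
protocol TokenProvider { func token() -> String }
struct FakeNetwork: TokenProvider { func token() -> String { "fake-token" } }

// Real collaborator #1: persists the session token.
final class SessionStore {
    private(set) var token: String?
    func save(_ newToken: String) { token = newToken }
}

// Real collaborator #2: orchestrates the login across both actors.
final class LoginFlow {
    private let network: TokenProvider
    private let store: SessionStore

    init(network: TokenProvider, store: SessionStore) {
        self.network = network
        self.store = store
    }

    var isLoggedIn: Bool { store.token != nil }
    func logIn() { store.save(network.token()) }
}

// One test covers the whole interaction: fetch, persist, expose state.
let store = SessionStore()
let flow = LoginFlow(network: FakeNetwork(), store: store)
flow.logIn()
assert(flow.isLoggedIn && store.token == "fake-token")
```

If any link in that chain breaks (the flow forgets to persist, the store drops the token), this single test fails, which is exactly the broad safety net the quadrant describes.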

  • Easy to implement, high value: these are the typical granular unit tests found on well-defined use cases, e.g. a login button should not be enabled when either the username or the password is missing. A unit test for that use case should be fairly simple to implement, and on failure the signal will be clear 👉🏽 this contract was violated.
Well defined use cases tend to produce code that’s easy to test

Not only does the above example look dead simple to read, it yields a clear signal whenever it fails. Whenever a test fails, it should clearly state three things: what failed, where, and why.
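The login-button contract from this quadrant could be captured in a test as small as this (a sketch; the function name is mine):

```swift
// The contract: the button is enabled only when both fields are non-empty.
func isLoginEnabled(username: String, password: String) -> Bool {
    !username.isEmpty && !password.isEmpty
}

assert(isLoginEnabled(username: "", password: "hunter2") == false)   // missing username
assert(isLoginEnabled(username: "ada", password: "") == false)       // missing password
assert(isLoginEnabled(username: "ada", password: "hunter2") == true) // both present
```

No mocks, no setup, and each failing assertion names the exact input combination that broke the contract.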

How to proceed

Test quadrant classification.

It all boils down to context, which is very subjective. Keep in mind that within a large project multiple levels of code maturity exist at the same time. There will be legacy places that require hard-to-implement, low-value tests just to get a sense of how that part of the code behaves under certain inputs. Ideally we shouldn’t keep relying on them as the code matures, because we ought to make incremental improvements toward simplicity, maintainability, and readability.

Conclusion

At the end of the day, we as software engineers don’t have to think of writing tests as a burden or some extra hoop we have to jump through in order to call it a day. They are the custom safety nets we lay out to reliably ship code faster, protect ourselves against regressions, and quickly spot what, where, and why something went wrong.

Happy testing folks 🤓🧑🏽‍💻

