App developers: Are your unit tests paying their way?

Abstract: TDD (test driven development) and unit tests have become a silver bullet for quality in mobile software development. Unfortunately there is no such thing as a silver bullet. TDD and unit tests have their place, but they are neither comprehensive nor efficient tools to ensure quality for user-facing apps. A healthy combination of code reviewing, assertions, upfront design and automated UI testing is suggested as a better alternative.

The perceived consensus about unit testing and TDD on the internet these days goes something like this:

Testing is good. People should write tests. They should write them before writing the actual code. And write them for every line of code they write and strive for 100% coverage. If you don’t do this (or at least try, or have a good excuse not to), you are not a good developer.

Businesses have started to pick up the hype, and in job specs and interviews, many companies now see TDD and unit tests as a litmus test for good developers, even for mobile roles.

In 2014, Jim Coplien published a widely discussed article titled Why Most Unit Testing is Waste. As I read it I found myself agreeing a lot. His key takeaways are:

  • tests need to add business value to justify the time spent on them, i.e. they should guard against regressions at the user functionality level
  • system level tests are far more important than unit tests
  • unit tests are only useful for algorithms that can be traced to an external specification of correctness
  • software architecture that is optimised for unit tests is harder to understand
  • developers tend to overestimate the usefulness of unit tests, giving them a false confidence in their code
  • a lot of tests are tautological; they test trivial code such as getters and always succeed; these tests have a negative business value
  • even 100% code coverage does not guarantee correctness
  • developers become so focussed on making tests pass that they lose sight of the big picture (user stories, overall integrity)
  • the overemphasis on TDD is the result of the so-called fail-fast culture, where programmers, thanks to fast hardware, never had to learn how to think before they write code

The reaction on HN and reddit was overwhelmingly negative: people mostly refused to entertain the possibility that Coplien had a point.

Like Coplien, I see no point in writing unit tests unless it’s against a non-trivial algorithm with a clear spec or rules that define its results. But not for code that draws pixels on a screen, transforms data or does networking or disk I/O. Not even for code that does these things through indirection. Why not? Simply because I found that writing unit tests for such code does not pay off in terms of time spent. It is both counterproductive and counterintuitive. Or have you ever tried to create a user interface using TDD, as in write tests first? There are plenty of examples of people trying to unit test view controllers, but none of them are ever test-first. That’s because the only proof that a view controller works is that it draws the pixels on the screen in the right way, and that cannot be unit tested.

The fact that a thousand unit tests are green says nothing about whether or not an app works as expected. That’s because the scope of unit tests is too small. Unit tests can’t guard against typical bugs like:

  • unclear or changed specifications (user experience, web services)
  • broken bindings between view and model code
  • misspelt string literals
  • layout issues
  • animation glitches
  • UI elements not being refreshed after model changes
  • not handling failed network requests
  • implicit sharing through global variables or shared mutable objects
  • out of bounds access errors in arrays

Another weakness is that unit test criteria (i.e. conditions for passing) can be pulled out of thin air by the developer and may not be connected to any actual functionality. Yes, your view model passes its unit tests, but you misunderstood the requirements and now you have to change the tests and the implementation. BDD (behaviour driven development) tries to address this shortcoming by introducing domain specific languages. From my experience, BDD frameworks such as Specta or Quick make tests a little more readable, but don’t guarantee that tests have business value. This can only be achieved by reviewing tests with a stakeholder, business analyst, or someone else who understands the user requirements.

It’s not that I’ve never worked with TDD or I don’t understand why people so passionately defend it. I know how rewarding it can be to turn tests from red to green, turning coding into a fun game. But I also remember that sobering moment when my code still had lots of bugs because I had misunderstood user requirements, and then had to change both my unit tests and production code.

Proponents will point to a list of secondary benefits they attribute to TDD. I’m not saying that all of the following statements are wrong, but for my personal style of writing code I have either not experienced the benefits below or found better alternatives:

it provides early feedback and eliminates bugs early that would take long to debug in a system test

Call me old fashioned, but I found that thinking intensively about code (possible errors, side effects, simplifications, effects on architecture) before and while writing it, until I understand what every single line does, and sometimes taking a step back to draw some diagrams, can reduce the number of bugs tremendously. One way to do this is code reviewing, or even better, pair programming.

I also found that assertions are an invaluable tool for debugging (Coplien recommends replacing tests with assertions whenever possible). Use them to state your assumptions wherever you can. They will catch errors before they can cause other errors and become hard to debug. In comparison to unit tests, assertions are lightweight and dead easy to maintain, as they are right in the production code and not in a separate file. For that reason they also serve better as documentation than unit tests.
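To make this concrete, here is a minimal sketch of assertions stating assumptions right next to the code that relies on them. Python is used only for brevity; the idea carries over to Swift or Kotlin. The function `apply_discount` and its rules are hypothetical, made up for illustration.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, rounded down to whole cents.

    Hypothetical example function, not taken from any real code base.
    """
    # Assumptions about the inputs, stated where they matter.
    assert price_cents >= 0, f"price must not be negative, got {price_cents}"
    assert 0 <= percent <= 100, f"percent must be within 0 to 100, got {percent}"

    discounted = price_cents * (100 - percent) // 100

    # Assumption about the result: a discount never raises the price.
    assert 0 <= discounted <= price_cents
    return discounted
```

A caller that passes `percent=150` fails immediately at the offending line instead of producing a negative price that surfaces somewhere else later. Note that Python drops `assert` statements when run with the `-O` flag, so they are a development aid, not a replacement for validating user input.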

it’s easier to mock out a single class and test it thoroughly than run a UI test of comparable granularity

That is true, but code that has no business logic (UI, network, disk I/O) rarely needs more granular testing than you can achieve with UI testing, or at least none that pays off in terms of time spent.

it makes refactoring easier because it catches regressions

This is only true if the scope of your refactoring is small enough for the interface of a function or class not to change. For any larger scale refactoring, say, changing a function’s parameters, adding and removing functions, or even modifying the architecture itself, your unit tests actually make it harder to refactor: You need to rewrite them each time you do this. I don’t know about you, but I find myself changing, adding and removing functions and classes quite a lot, and unit tests would slow me down greatly.

it improves software architecture because you have to think about what a function looks like from the caller’s perspective before writing the implementation

You should not need unit tests to think about that. An alternative to this is the following workflow: When adding a function or class, write it as a stub first (as you would for TDD). Then, call those stubs as you intend from the call sites, just to see if your API is convenient to use. If it isn’t, change your API and experiment until it feels right. Then, try to write the function body. Again, if you feel it isn’t quite right (e.g. parameters have the wrong type, are missing, or redundant), change your API. Repeat until both the interface and the implementation feel right.
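The first two steps of that workflow might look like the sketch below, again in Python for brevity. All names (`Article`, `unread_titles`, `render_inbox`) are hypothetical and exist only to show the shape of the experiment: the stub has a signature but no body, and the call site is written against it to see how the API reads.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    read: bool = False


def unread_titles(articles: list[Article], limit: int) -> list[str]:
    """Stub: the signature is the experiment, the body comes later."""
    raise NotImplementedError


def render_inbox(articles: list[Article]) -> str:
    # Call site written against the stub. If passing `limit` here feels
    # awkward, change the signature now, before any body or test exists.
    titles = unread_titles(articles, limit=3)
    return "\n".join(f"* {title}" for title in titles)
```

Nothing here has to run successfully yet. The point is that changing `unread_titles` at this stage costs one edit at the stub and one at the call site, with no test file to rewrite.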

it serves as documentation by doubling as sample code

Good sample code is stripped down to essentials and structured so that it guides the reader step by step, neither of which is true of test code with all its edge cases and other noise.

I realize system tests (i.e. automated UI testing) are hard to write for mobile apps, and I have to admit that my experience with them is limited. There are a lot of difficulties associated with automated UI tests: They are slow, unreliable, and it is tricky to mock an app’s physical environment (disk, network, time etc.) to create test scenarios. But the fact remains that only UI tests can verify that an app actually works.

The point I’m trying to make is:

  • not all code is suitable for unit testing
  • use unit tests where they make sense: Abstract, isolated logic with a large number of possible results that doesn’t do any I/O
  • don’t waste time testing trivial code (getters, initialisers, etc.)
  • don’t fall for the red-green fever: Question the value of each test before you write it

Instead, always try to write the simplest code you can and understand every single line of it. Get a second pair of eyes to look at it. And apply the same scrutiny to test code.
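For contrast, here is a sketch of the kind of unit test I do consider worth its time: isolated logic, no I/O, and expected results that come from an external specification rather than from the developer’s head. The example is my own choice, not one from Coplien’s article: the Gregorian leap year rule, written in Python for brevity and tested with the standard `unittest` module.

```python
import unittest


def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTests(unittest.TestCase):
    def test_follows_gregorian_rule(self):
        # Expected values come from the calendar rule, not from the code.
        cases = {2024: True, 2023: False, 1900: False, 2000: True}
        for year, expected in cases.items():
            with self.subTest(year=year):
                self.assertEqual(is_leap_year(year), expected)
```

Saved in a file, this can be run with `python -m unittest`. Every case guards behaviour a user would notice, which is the bar each test should clear.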
