In recent years there has been a growing perception that unit tests are the single best way to have useful code. Notice that I say useful code, not stable code. Having stable code is a misnomer. What we want is useful code. Useful code both works properly and provides something of value to the user. A blanket code coverage metric does not give us a clear picture of usefulness. Even worse, many organizations have improperly sold it to business owners as a catch-all. High code coverage is not a panacea.
Case Study in Separating the Concerns
Let’s start out with a case study before going into the details. I was brought onto a project as the technical lead because it was struggling after its first year of development. The product had a quarterly release cycle. Every release cycle the business group insisted on having 4 weeks to test and validate our app. They did this because bugs in previous releases had made them fearful. The engineering director over the project had a code coverage requirement of 60%, which was getting hit. But the way we quantified the project’s success didn’t match the frustration everyone was experiencing.
This was obviously a conflation of concerns. A bad application structure, which caused inconsistent business logic, was the cancer, not a lack of proper unit tests. Having a bit of autonomy, we removed the code coverage metric from check-ins. Instead, we focused on improving the business logic while keeping to 80% of the feature roadmap for the next release. We completely removed and isolated 100% of the business logic from the app and centralized it into 2 classes. Meanwhile, we opened conversations with the business group to get all the possible inputs and outputs of this complex business process. The business logic ended up being 150 lines of code, but we had over 200 separate unit tests with the expected inputs and outputs. The same lines of code were being tested multiple times under multiple conditions. If you could have more than 100% code coverage, we would have been well above 100%. But since we stopped writing UI tests, our overall code coverage dropped to 35%. So how did the next release go?
On the next release cycle the business team had our app for 3 days and couldn’t find a single bug. The following release cycle the business team didn’t even want to check the app because they were confident it worked. On that same release we did a massive reflow of the entire app’s pages and views while still having the fewest bugs of any release. We didn’t fix the project by writing fewer unit tests but by testing the areas that provided the highest reward while skipping tests for code where they are less effective. We could have stopped testing our business logic when we got to 100% code coverage, but we might have missed a subtle combination of inputs/outputs. Harden the code that needs it but don’t test everything.
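To make the input/output approach concrete, here is a minimal sketch in Python. The discount rule, the `order_discount` function and the coupon code are all hypothetical and invented for illustration; they are not the business logic from the project. The point is the shape: a small pure function, plus a table of expected inputs and outputs that could be agreed on with the business group.

```python
from decimal import Decimal
from typing import Optional


def order_discount(subtotal: Decimal, is_member: bool, coupon: Optional[str]) -> Decimal:
    """Hypothetical business rule (illustration only): discounts stack, capped at 15%."""
    rate = Decimal("0")
    if is_member:
        rate += Decimal("0.05")
    if coupon == "SPRING10":  # made-up coupon code
        rate += Decimal("0.10")
    if subtotal >= Decimal("100"):
        rate += Decimal("0.05")
    rate = min(rate, Decimal("0.15"))
    # Round to whole cents.
    return (subtotal * rate).quantize(Decimal("0.01"))


# Each row: (subtotal, is_member, coupon, expected discount).
# A handful of rows already reach every line; the later rows exist to pin
# down combinations, which is where the subtle bugs tend to hide.
CASES = [
    (Decimal("50"), False, None, Decimal("0.00")),
    (Decimal("50"), True, None, Decimal("2.50")),
    (Decimal("50"), False, "SPRING10", Decimal("5.00")),
    (Decimal("50"), False, "BOGUS", Decimal("0.00")),
    (Decimal("100"), False, None, Decimal("5.00")),
    (Decimal("100"), True, "SPRING10", Decimal("15.00")),    # 20% capped to 15%
    (Decimal("99.99"), True, "SPRING10", Decimal("15.00")),  # 14.9985 rounds to 15.00
]


def run_cases():
    """Return the rows whose actual output differs from the expected output."""
    failures = []
    for subtotal, is_member, coupon, expected in CASES:
        actual = order_discount(subtotal, is_member, coupon)
        if actual != expected:
            failures.append((subtotal, is_member, coupon, expected, actual))
    return failures


if __name__ == "__main__":
    print(run_cases())
```

Adding a row is cheap, so the table can keep growing long after line coverage has hit 100%.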
The Cost of Stability
What do unit tests really do to our code base? When we have unit tests and continuous integration, they add barriers against change on each commit. That’s it. (Code reviews and spec documents speak more to the quality of tests.) Whenever we want to change our code, it costs time to update our unit tests. Unit tests are very good at stabilizing a code base, but they are a double-edged sword: they add a barrier to changing both useful and useless code.
“[Unit tests] are a double-edged sword: they add a barrier to changing both useful and useless code.”
Useless code is code that doesn’t provide value to our users. In many engineering groups, both large and small, we are sheltered from the pain of our users. The pain of having a bug is only one type of failure. Unfortunately it’s the one pain point that we overwhelmingly focus on. There’s also the pain of unresponsive apps and slow page loads. The majority of web sites and apps are used on mobile phones, where speeds, even in the US, are still typically 3G or below [1][2]. There’s the pain of confusion: users who can’t find what they are looking for in our apps and sites. And lastly there is the pain of uselessness, when something that we build doesn’t have any value to our users. As developers we need to work to better quantify and attach ourselves to the pain (as well as the joys) of our users.
Unit tests can only test against what we think could be a bug, not what actually impacts our users. They can go a long way toward avoiding bugs, but they are not a panacea. Real-time monitoring is powerful as well. It gives developers a better idea of what actually affects users. Having unit tests and continuous integration without real-time user exception handling, real-time performance monitoring, user studies and user analytics is a short-sighted strategy that distances developers from the pain that our users actually face.
Balancing Usefulness with Stability
Useful code both works properly and provides value. But the second part of this criterion is harder to gauge. Even worse, the value of our code can change over time. Something that users find valuable today might not be valuable in the next release. We need to be open to killing our code, and know when to do it, when the metrics and UX designers tell us to. For some guidance on balancing stability and usefulness, consider the following.
The utility of some types of code is obvious and will have very little change. If you’re building a bank ledger it has to be bug-free, flat out. Without being perfect it offers no usefulness. Unit test critical business logic code to high hell.
But when should we question if our code will be useful? In my projects the most questionable code is UI code. Think about the entirety of the systems we are building, not just the computer. We are making an interface for people who have a wide variety of understandings of the business logic we’re exposing. If the user doesn’t understand the UI, your code is useless. I’ve been fortunate enough to work for R/GA for a number of years. R/GA collects world-class UX designers like Jay Leno collects cars. But even the best UX designers will not get an interface right the first time. It’s a process of iteration. When we create UI code, adding unit tests adds barriers to iteration as well. Slowing iteration slows the pace at which we figure out what’s best for our users. Our apps need to work properly, but a button being slightly out of place doesn’t have the same impact as 1 + 1 not equaling 2. We need to build interfaces quickly, learn what they do well and poorly, and be able to throw them out quickly when they underperform.
“Think about the entirety of the systems we are building, not just the computer.”
Maximizing the Impact of Unit Tests
When TDD & Unit Tests Shine
Where does this leave TDD? TDD is still very important, but I think the biggest wins for TDD are not what you might think. TDD, just like unit tests, is a double-edged sword. Some types of projects benefit from TDD more than others. But the area where I’m convinced it is a massive win is developer education. Being forced to write unit tests first forces developers to recognize their dependencies and isolate them. Without proper recognition and isolation of dependencies, they will take a long time to mock and stub their data. Forcing developers to feel the pain of dependencies encourages them to write better unit tests and, more importantly, write better code.
Along these lines, I highly encourage developers to write unit tests and to learn and internalize where the dependencies are in their own code. A better term for dependencies is externalities. Each externality is a potential failure point in your application. I did TDD for over a year on a huge project and learned immensely about externalities from doing it.
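As a small illustration of what isolating an externality can look like, here is a sketch in Python. Everything in it is hypothetical: the `convert` function, the `RateLookup` type and the stubbed rate table are made up for this example. The externality (where exchange rates come from) is passed in as a parameter, so a test can swap in a stub instead of mocking a network call buried inside the function.

```python
from typing import Callable

# The externality, made explicit: anything that maps (source, target) to a rate.
# In production this might wrap an HTTP call; that part is assumed, not shown.
RateLookup = Callable[[str, str], float]


def convert(amount: float, src: str, dst: str, lookup_rate: RateLookup) -> float:
    """Convert an amount between currencies using an injected rate lookup."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if src == dst:
        return amount  # no lookup needed, so no chance for the externality to fail
    return round(amount * lookup_rate(src, dst), 2)


def stub_rates(src: str, dst: str) -> float:
    """Test stub: a fixed, in-memory table with invented rates."""
    table = {("USD", "EUR"): 0.5, ("EUR", "USD"): 2.0}
    return table[(src, dst)]


if __name__ == "__main__":
    print(convert(10.0, "USD", "EUR", stub_rates))
```

If writing the stub is painful, that is usually the signal that the dependency is tangled into the logic rather than sitting at its edge.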
I also highly encourage developers to learn functional programming, and Clojure/ClojureScript in particular. Reading about ClojureScript or using ClojureScript-inspired libraries isn’t enough. You need to write ClojureScript. Without writing ClojureScript, I wouldn’t have learned to internalize the repetition of steps/processes in my own code. I’ve learned that time is our most detrimental externality. Really understanding the intricacy of how time affects my code and minimizing its effect is a massive improvement to the code I write. Since learning to write in ClojureScript, I’ve spent the last year writing code that I literally could not have written without the convergence of practices in ClojureScript. I’m learning that the steps your code takes (FP patterns) are just as important as the connections between code (OO patterns). Learn to understand why and how to isolate time, have fewer side effects, make your code simpler, and reduce externalities/dependencies. Your code will be easier to understand and easier to change.
“Time is our most detrimental externality.”
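One narrow, concrete instance of isolating time is the wall clock. The idea of time as an externality is broader than this (it also covers state that changes between steps), so treat the following Python sketch as one example under that assumption. The `is_expired` and `is_expired_now` functions are hypothetical names. The logic takes the current time as an argument and never reads the clock itself, which leaves it a pure function of its inputs.

```python
from datetime import datetime, timedelta, timezone


def is_expired(issued_at: datetime, now: datetime, ttl: timedelta) -> bool:
    """Pure function: the same three inputs always give the same answer."""
    return now - issued_at >= ttl


def is_expired_now(issued_at: datetime, ttl: timedelta) -> bool:
    """Thin edge wrapper: the only place that actually reads the clock."""
    return is_expired(issued_at, datetime.now(timezone.utc), ttl)


if __name__ == "__main__":
    issued = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
    ttl = timedelta(minutes=30)
    # Time is just data here, so the boundary cases are trivial to pin down.
    print(is_expired(issued, issued + timedelta(minutes=29), ttl))
    print(is_expired(issued, issued + timedelta(minutes=30), ttl))
```

Tests exercise `is_expired` with fixed timestamps, and the wrapper stays small enough that it hardly needs a test at all.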
If you need a place to start learning ClojureScript, check out my post on the ClojureScript data layer in our React app and the resources that it recommends in the appendix (http://bit.ly/triforce-of-power).
Write unit tests. Unit tests are great. But there is a sliding scale between the cost to create and maintain them and the value that they provide a project. Don’t assume that a blanket level of code coverage will give the same benefits for everything in your code base. Code coverage is a decent indicator, but everything comes with its own cost. Selling it as a metric of security to non-programmer groups can lead down a road where engineering is overworking to underproduce. Constantly tweak to find the right balance for the type of code you’re writing, and look at multiple factors to gauge a project’s success.