Test Confidence

Kane Baccigalupi
Apr 20, 2015


Unlike my last post, this one is on the technical side of tech. My wife, who works as a healthcare provider, couldn't even proofread it without yawning. I promise to get back to important business, people, and process stuff soon. For now, please enjoy a deeply technical dive into automated testing practices, or politely move along.

I spend most of my time these days helping companies pay down technical debt. The first step is putting some code under reliable tests. It sounds so simple, but having confidence in your tests is the hard part.

Tests are not mathematical proofs, which raises a pressing question: how much testing is enough? When I started testing, I had no idea how to answer that question, and I overcompensated by testing too much. I have seen other people take the same insecurity to a place of despair: there never seem to be enough tests, so why bother?

Hard concepts can be broken into guidelines. Because people seem to thrive on rules, I am going to share mine for testing object-oriented code.

THE RULES:

  1. Don’t test configuration.
  2. Don’t test libraries or frameworks you depend on.
  3. Only test public methods.
  4. Find a clean boundary in the code and treat everything in the boundary as a black box.
  5. The smallest boundary is a single object.
  6. Test the return value of query calls sent into the black box.
  7. Test the state change expected when a command is called on the black box.
  8. Test that commands are sent to collaborators.
  9. Stub queries sent to collaborators as needed, but don’t make assertions.
  10. Don’t mock or stub anything inside the black box.
  11. Prefer more tests where the boundary is tiny.
  12. Use larger boundaries to confirm connections between collaborators.

Boundaries? Collaborators? Mocks?
Commands? #!@$!?

There is so much hidden information in these rules as to make them unusable without some explanation, so, here we go:

Boundaries
The difference between a unit test and an acceptance test is the scope of that test. A unit test focuses on a single object and a single method within that object. An acceptance test will consider an application or even a constellation of applications as the scope of the test. In the middle ground are tests that exercise a series of objects that work together. All these scopes are essential for confidence in a test suite.

Boundary is different from scope. For example, when testing a method on an object, the boundary is the object, not the method. An object is like the atom of the object-oriented world. Trying to meddle with parts smaller than an object leads to nuclear instability.

Methods on the object under test should not be faked. Doing surgery on the object to inject just the right private data is also asking for radioactive fallout.

Tests are more than just support for the code. The tests are the first real use case of the code, and they yield advice on making the code better. When the code is hard to test, it will be hard to use.

Getters and setters do a lot to alleviate the problems of setting up private state. Software engineers worry too much about privacy and protecting other people from themselves. If tests are asking that these accessors be born, the code will benefit from them too.
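As a sketch (the Invoice class and all of its names are invented for illustration), exposing an accessor lets a test arrange state directly instead of doing surgery on private internals:

```ruby
# Hypothetical example: a tiny object whose state a test needs to arrange.
class Invoice
  attr_accessor :discount   # the setter lets tests (and callers) set state directly

  def initialize(total)
    @total = total
    @discount = 0
  end

  def amount_due
    @total - @discount
  end
end

invoice = Invoice.new(100)
invoice.discount = 25       # no instance_variable_set surgery needed
puts invoice.amount_due     # prints 75
```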

It’s also likely that the object is too complicated if the setup for testing is hard. Needing to fake methods on the object under test is the smell of an object with too many responsibilities.

Collaborators
The best way to deal with an object that has too many responsibilities is to break it up. Collaborators are just other objects.

Depending on the boundary of the test, real collaborators shouldn't always be used. Objects outside the boundary should be replaced with fake objects.

There is a lot of rigor around naming these kinds of fakes. Mocking is the word used when assertions are made about calls to the fake. Stubbing, on the other hand, is setting up a canned response and involves no assertions. Here is a more in-depth description of those distinctions.
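A minimal, framework-free sketch of that distinction (Signup and the mailer classes are hypothetical names):

```ruby
# The object under test sends a command to a mailer collaborator.
class Signup
  def initialize(mailer)
    @mailer = mailer
  end

  def register(email)
    @mailer.deliver_welcome(email)   # a command sent to a collaborator
  end
end

# A stub: a canned response, with no assertions attached.
class StubMailer
  def deliver_welcome(_email)
    true
  end
end

# A mock: records calls so the test can assert they happened.
class MockMailer
  attr_reader :delivered

  def initialize
    @delivered = []
  end

  def deliver_welcome(email)
    @delivered << email
  end
end

Signup.new(StubMailer.new).register("a@example.com")  # stub: just keeps things working

mock = MockMailer.new
Signup.new(mock).register("a@example.com")
raise unless mock.delivered == ["a@example.com"]      # mock: assert the call was made
```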

Queries and Commands

Test capsule from Sandi Metz’s RailsConf 2013 talk

Queries are method calls that don't change state; they have no side effects. A query is an interface into the information within a boundary.

Testing a query should focus on its return value. Also, testing queries only matters within the scope of the boundary. Asserting that a query is called on a collaborator is wasted testing, because nothing changes as a result of that query.
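A sketch of a query test that asserts only on the return value (Order is a made-up class):

```ruby
# A query test asserts on the return value and nothing else.
class Order
  def initialize(prices)
    @prices = prices
  end

  # A query: computes and reports, changes no state.
  def total
    @prices.sum
  end
end

order = Order.new([5, 10])
raise unless order.total == 15   # assert the return value
raise unless order.total == 15   # calling it again changes nothing
```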

It is frequently necessary to query collaborators and do the appropriate thing inside an object or boundary. That gets back to stubbing method calls on the collaborator. An object can respond to many different query values from a collaborator, and stubs on fake objects make that simple.

Commands are the complete opposite of queries in that their main purpose is to change state. Because that state change is the most important thing, tests should make assertions about that change. Command calls that cross the boundary are also important and should be tested. This is where mocks come in, with assertions that the correct calls are made to the collaborator.
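Sketched with a hand-rolled fake (TurnstileCounter and its display are invented names), a command test asserts on the state change inside the boundary and on the command sent to the collaborator:

```ruby
# A command mutates state and may send further commands outward.
class TurnstileCounter
  attr_reader :count

  def initialize(display)
    @count = 0
    @display = display
  end

  # A command: changes state and notifies a collaborator.
  def click
    @count += 1
    @display.show(@count)   # outgoing command to the collaborator
  end
end

# A mock-style fake that records what it was told to show.
class FakeDisplay
  attr_reader :shown

  def initialize
    @shown = []
  end

  def show(n)
    @shown << n
  end
end

display = FakeDisplay.new
counter = TurnstileCounter.new(display)
counter.click

raise unless counter.count == 1     # assert the state change
raise unless display.shown == [1]   # assert the command reached the collaborator
```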

It is common to see methods that both change state and return query data. A method that mixes the reporting of state with side effects is harder to predict and test. When tests feel troublesome near this type of mixed call, separate commands from queries in the code.
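A hypothetical before-and-after of that separation (the account classes are invented for the example):

```ruby
# Before: deposit both changes state and reports it.
class MixedAccount
  def initialize
    @balance = 0
  end

  def deposit(amount)   # command and query fused together
    @balance += amount  # state change...
    @balance            # ...and a report of state
  end
end

# After: a command that only mutates, and a query that only reports.
class Account
  attr_reader :balance  # query: reports state, changes nothing

  def initialize
    @balance = 0
  end

  def deposit(amount)   # command: mutates, returns nothing meaningful
    @balance += amount
    nil
  end
end

account = Account.new
account.deposit(40)
raise unless account.balance == 40
```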

These ideas are detailed in a great talk on what to test. It also details why you should not test private methods.

What Not to Test
We all try hard to test only the code we write: not libraries, not configuration. Often, though, configuration and libraries come together in ways that make a coder believe there is something important to test. I have fallen into this trap.

I work a lot in Ruby and JavaScript. In Rails, ActiveRecord association setup configures the framework to reflect the database. It is common to see tests, aiming for thoroughness, that assert the presence of methods automatically generated by that configuration. While such a test may catch a one-time configuration mistake, it isn't testing application logic. When logic enters that kind of association via conditions, a test becomes necessary.

Similarly, I have worked in front-end JavaScript frameworks that take the pain out of rendering by doing it automatically. Testing that a boilerplate view renders exercises only the framework and its configuration. As soon as logic changes the configuration of the framework, it is time to test.

Good questions for staying on track are: "What am I trying to test?" and the complementary question, "What am I actually testing?"

Economics

Rails testing pyramid taken from the blog at codeclimate.com

Tests need to be cheap to be maintainable. By cheap, I mean fast to run. Unit tests are cheap. They don’t hit the file system. They don’t hit any databases. They don’t even hit other objects. Unfortunately, it is completely possible to write exhaustive unit tests for every class in a project while still having large gaps in test coverage, hiding bugs.

A test suite also has to exercise the connections between objects. There aren’t great names for these kinds of tests, but I think of them as plumbing tests.

The larger a boundary gets, the fewer tests are needed to exercise that plumbing. More comprehensive, multi-object tests are expensive. They exercise more code, but integration tests can't cover the fractal-like multitude of possibilities in a complicated system. Testing very far from unit-level logic leads to insecurity.

Instead, write exhaustive unit tests, and use broader tests to confirm that everything is collaborating well.
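A sketch of a plumbing test, where the boundary widens to three real objects and only the edge of the boundary is asserted on (Checkout, Cart, and TaxCalculator are all invented names, with a made-up flat 10% tax):

```ruby
# Assumed: a 10% flat tax, purely for illustration.
class TaxCalculator
  def tax_for(amount)
    (amount * 0.1).round(2)
  end
end

class Cart
  def initialize(prices)
    @prices = prices
  end

  def subtotal
    @prices.sum
  end
end

class Checkout
  def initialize(cart, calculator)
    @cart = cart
    @calculator = calculator
  end

  def total
    subtotal = @cart.subtotal
    subtotal + @calculator.tax_for(subtotal)
  end
end

# No fakes: the test confirms the three objects actually plumb together.
checkout = Checkout.new(Cart.new([10, 20]), TaxCalculator.new)
raise unless checkout.total == 33.0
```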

Again, I am saying in brief what has been better explained in a great blog post.

Practice
It is probably obvious from the links that nothing here is original.

The guidelines, on their own, will frustrate developers who don’t also have the goal of improving code. The rules should be used in a back-and-forth way to enforce better software design. In hard refactors, where both code and tests are weak, it is necessary to break these rules just to get coverage. The test smells lead to code changes, which allow the return to better tests. It’s a good cycle, one that builds confidence.
