TDD is ‘canard.
Tell it to the Duck.
Introducing Quack-Driven Development.
A modern Waterfowl Methodology for programmers.
The purpose of life is partly to have joy. Programmers often feel joy when they can concentrate on the creative side of programming, so Ruby is designed to make programmers happy.
Yukihiro Matsumoto (a.k.a. “Matz”), 2000.
In explaining your code to another person you must explicitly state things that you take for granted when going through it yourself. By verbalising these assumptions, you may gain new insights into the problem.
The Pragmatic Programmer: From Journeyman to Master, 2000.
Is TDD Dead?
I followed the “TDD is Dead” controversy with much interest, because I’ve long struggled with the process, famously summed up by Uncle Bob’s Three Rules of TDD.
- You are not allowed to write any production code unless it is to make a failing unit test pass.
- You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
- You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
That has always seemed overly proscriptive to me, imposing arbitrary constraints that result in unnatural programmer behaviour.
The Bowling Game Kata
Uncle Bob demonstrates his three rules of TDD by showing how to implement the scoring mechanism for ten-pin bowling. The same demonstration also happens to illustrate my criticism of them.
- Test that you can create a bowling game instance (fail).
- Implement an empty bowling game class (pass).
- Test that the instance returns a score of zero for a gutter game (fail).
- Implement a score function that returns zero for everything (pass).
- Test that the score is 20 if a 1 is bowled each turn (fail).
- Change the score function to return the sum of each turn (pass).
- Test that the score is correct when a spare is bowled (fail).
- … (and so on).
That a programmer would actually follow this flip-flopping process as they implement a bowling game scoring program is, to me, very strange indeed.
Would you really write and save a scoring function that returns zero simply to see a test go green? Would doing that really be “driving” the implementation? You know it’s incorrect, right? But you’d still do it? WHY?
We could equally call this approach MDT, or “Mistake-Driven Testing”: deliberately writing incorrect code to “drive” implementation of tests.
It annoys me that these tests encode implicit assumptions about how bowling works, instead of making these assumptions explicit. You know, by turning them into requirements. What a coincidence that the fix to the test that bowling a 1 every frame scores 20 is to change the scoring function to sum up the score for each frame. Just returning 20 if a 1 is bowled in the first frame and a 0 otherwise would be simpler and would also have gotten everything green. How was the creative leap of actually adding things up “driven” by testing? It wasn’t, that was rhetorical goddammit, nothing is being “driven” here. Uncle Bob knew what he was going to do all along, and just decided to write the test first just because. Sigh.
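To make the point concrete, here is a minimal Ruby sketch of that cheat. The class name CheatingGame is made up for illustration and is not part of the Kata; it keeps both tests written so far green without ever adding anything up.

```ruby
# Hypothetical "simplest thing that passes": no summing required.
class CheatingGame
  def initialize
    @rolls = []
  end

  def roll(pins)
    @rolls << pins
  end

  # Returns 20 if the first roll was a 1, and 0 otherwise.
  def score
    @rolls.first == 1 ? 20 : 0
  end
end

gutter = CheatingGame.new
20.times { gutter.roll(0) }
gutter.score # => 0

ones = CheatingGame.new
20.times { ones.roll(1) }
ones.score # => 20
```

Nothing in the two tests rules this implementation out, which is why I don't see the leap to summing as being "driven" by them.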
So Why Test?
Even though the way TDD is applied in the Bowling Game Kata is overkill, there are many valid reasons to write tests for your code.
- Documentation. Communicate intent to other programmers.
- Design. Capture requirements and use these to constrain scope.
- Defence. Make it easy for other programmers to do the right thing.
- Debugging. Reproduce issues before fixing the underlying cause.
- Contract. Validate that inputs and outputs for a function are correct.
- Safety. Increase trust when using a dynamically typed language.
- Motivation. Flag the little bit of work that you need to do next.
- Focus. Prevent bloated or unnecessarily abstract implementation.
- Guidance. Define a finish line, so you know when you’re done.
- Efficiency. Make best use of time when testing manually.
The three rules of TDD don’t explicitly address many of these reasons for writing tests, just as they don’t explicitly address the refactor part of the red-green-refactor cycle.
I suspect the refactor step is where the creativity lies, making TDD a process of “do things mechanically and incorrectly, then fix up the mess”. Too harsh?
It is much better to write a test to reproduce a reported issue than it is to debug in the traditional manner, as you can continue running the test to make sure the bug never returns.
Oftentimes building and running your application is orders of magnitude more complex than running your tests. You don’t want to spend 20 minutes compiling your console game and transferring it to your dev kit only to find it immediately crashes on start when a test would have revealed that problem in much less time.
Interestingly, several of the pro-testing reasons I give above are also good reasons for writing technical documentation. In fact…
Tests as Documentation
…it is often said that tests are the best form of technical documentation.
Documentation gets out of date. Tests cannot get out of date if you keep them green. They also describe the code in terms of other code, which is the perfect form of documentation for programmers.
Keeping documentation separate from code is a Bad Thing, because the two quickly diverge. For that reason, it is often written in the form of comments that may be extracted and reformatted into fancy documentation later.
If tests are the best form of documentation, and if documentation should be commingled with code, then… could tests also be written alongside code? Yes they could, if separated into two parts: Validations and Examples. And that was another rhetorical question.
- Validations are written inline with implementation. They capture requirements, contracts, and programmer intent and belief (where intent is something the programmer writing the code can document while belief is what a programmer trying to understand the behaviour of the code at a later time can document). Validations capture what the programmer would say if explaining the code to a colleague. Or a rubber duck.
- Examples set up the state, invoke the code under test, and check that the state is transformed in the expected way. They are kept external from the implementation.
The advantages of this are twofold. First, the descriptive validations are right there in the code for all developers to see, in the context in which they are of the most use. And second, validations can be executed in production.
Writing good validations is more important than writing good examples, and they should therefore be written first. This distinction isn’t emphasised by advocates of TDD, and projects with high test coverage may still lack the extra safety-net of validations. And even if validations do exist, chances are they are implemented only in test code, and will be absent when running in production.
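As a rough illustration of the split, here is a plain-Ruby sketch that doesn't use any framework. The spare? helper is hypothetical; the validations sit inside it and run wherever the code runs, while the examples live outside and only run when you choose to run them.

```ruby
# Validations: written inline, stating what I'd otherwise take for granted.
def spare?(first, second)
  [first, second].each do |pins|
    unless pins.is_a?(Integer) && (0..10).cover?(pins)
      raise ArgumentError, "a roll knocks down 0 to 10 pins"
    end
  end
  raise ArgumentError, "a frame only has 10 pins" if first + second > 10

  # A strike on the first roll is not a spare.
  first < 10 && first + second == 10
end

# Examples: kept external, they set up inputs and check the outcome.
raise "5 then 5 should be a spare" unless spare?(5, 5)
raise "3 then 4 should not be a spare" if spare?(3, 4)
```

The examples could be deleted without changing how the code behaves in production; the validations could not.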
Introducing Quack-Driven Development
Quack-Driven Development, or QDD, allows the programmer to express requirements, contracts, intent and belief inline with their implementation.
It’s a modern Waterfowl Methodology for software development.
QDD is motivated by the idea of transcribing a rubber ducking session, replacing comments in the code with testable assertions. It has a simple philosophy that informs, rather than constrains, how we do our work.
- Understand the problem. Don’t confuse the solution with the problem. Always be aware of why what you are doing is important. Read “Are Your Lights On?” for inspiration.
- Trust your assumptions. Asserting data correctness is work best done at the interface. Don’t clutter the business logic by not trusting the data.
- Respect the reader. You write code for other programmers to read, including yourself one year from now. So write code that you would want to read. Show, don’t tell.
Canard is a not-quite-yet proof-of-concept implementation of QDD in Ruby. It allows programmers to “talk to the duck” in code, using a quack emoticon, and demonstrates how I’d use QDD to complete Uncle Bob’s Bowling Game Kata. Here’s what it looks like.
Start with a Skeleton Implementation
class Game
  def initialize
    @frames = []
  end

  Q< "takes the number of pins knocked down"
  def roll(pins)
    Q< "called each time the player rolls a ball"
  end

  Q< "returns the total score for the game"
  def score
    Q< "only called at the very end of the game"
  end
end
- We begin by sketching out our implementation, after thinking hard (which is such an underrated aspect of what we do that I don’t even).
- The four “quacks” are inline with code. They make the four requirements of the Kata explicit and testable. And yes, the Q< is supposed to look like a duck.
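In case the duck looks like it couldn't possibly parse: it can. The sketch below is a hypothetical stand-in, not necessarily how Canard does it. It assumes Q is simply a constant that responds to `<`, which makes `Q< "..."` an ordinary Ruby expression. Here the duck only records what it hears.

```ruby
# Hypothetical stand-in for the duck. Canard may implement this differently.
module Q
  HEARD = []

  # Invoked by `Q< "description"`; this version only records the description.
  def self.<(description)
    HEARD << description
    description
  end
end

class Game
  Q< "takes the number of pins knocked down" # runs when the class is defined
  def roll(pins)
    Q< "called each time the player rolls a ball" # runs on every call
  end
end

Game.new.roll(3)
Q::HEARD.length # => 2
```

A real implementation would look the description up and run the matching checks rather than just remembering it.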
Define the Quacks
class Game
  Q< :roll do
    before "takes the number of pins knocked down" do
      expect(pins).to be_a(Integer)
      expect(0..10).to cover(pins) # 0 is a gutter ball
    end
    within "called each time the player rolls a ball" do
      expect(@frames).to be_a(Array)
      expect(0..10).to cover(@frames.length) # empty before the first roll
    end
  end

  Q< :score do
    within "only called at the very end of the game" do
      expect(@frames.length).to eq(10)
    end
    after "returns the total score for the game" do
      expect(retval).to be_a(Integer)
      expect(0..300).to cover(retval)
    end
  end
end
- Quacks are written in the same class, but may be stored in a separate file.
- They can be defined to run before the method being tested, within its body, or after it has returned.
- They are executed in the scope of the method being tested, so local variables defined in the method body are available to within quacks.
- They use rspec-expectations to verify state.
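For the curious, here is one way the before and after hooks could be wired up in plain Ruby. This is a sketch under my own assumptions, not Canard's implementation: Checked and check are hypothetical names, the checks are ordinary procs rather than rspec-expectations, and the Game here is simplified to sum raw pins. It wraps a method using Module#prepend. A within quack can't be done this way, since a wrapper never sees the method's local variables.

```ruby
# Hypothetical helper: wraps a method with before/after checks via prepend.
module Checked
  def check(method_name, before: nil, after: nil)
    wrapper = Module.new do
      define_method(method_name) do |*args, &block|
        instance_exec(*args, &before) if before # sees arguments and instance state
        retval = super(*args, &block)           # the real implementation
        instance_exec(retval, &after) if after  # sees the return value
        retval
      end
    end
    prepend wrapper
  end
end

class Game
  extend Checked

  def initialize
    @rolls = []
  end

  def roll(pins)
    @rolls << pins
  end

  def score
    @rolls.sum # simplified: no spare or strike bonuses in this sketch
  end

  check :roll, before: proc { |pins|
    unless pins.is_a?(Integer) && (0..10).cover?(pins)
      raise ArgumentError, "takes the number of pins knocked down"
    end
  }

  check :score, after: proc { |retval|
    raise "returns the total score for the game" unless (0..300).cover?(retval)
  }
end

game = Game.new
game.roll(3)
game.roll(4)
game.score # => 7
```

Because the checks run through instance_exec, they can read instance variables such as @rolls, which is roughly what "executed in the scope of the method being tested" means for the before and after cases.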
Flesh out the Implementation
class Game
  def initialize
    @frames = []
  end

  Q< "takes the number of pins knocked down"
  def roll(pins)
    Q< "called each time the player rolls a ball"
    @frames << Frame.new if _start_new_frame?
    Q< "a frame that needs a roll should exist here"
    Q< "if we're on the 10th frame, the frame may be complete"
    # Earlier strikes and spares collect this roll as a bonus.
    @frames.each do |frame|
      frame.bonus_roll(pins) if frame.needs_bonus?
    end
    @frames.last.roll(pins) if @frames.last.needs_roll?
  end

  Q< "returns the total score for the game"
  def score
    Q< "only called at the very end of the game"
    @frames.map(&:score).reduce(:+)
  end

  private

  def _start_new_frame?
    @frames.size == 0 ||
      @frames.size < 10 && !@frames.last.needs_roll?
  end
end
class Frame
  def initialize
    @rolls = []
    @bonus_rolls = []
  end

  Q< "returns whether this frame is waiting for a roll"
  def needs_roll?
    !_strike? && @rolls.size < 2
  end

  Q< "takes the number of pins knocked down"
  def roll(pins)
    Q< "called only if the frame is waiting for a roll"
    @rolls << pins
  end

  Q< "returns whether this frame is waiting for a bonus roll"
  def needs_bonus?
    _strike? && @bonus_rolls.size < 2 ||
      _spare? && @bonus_rolls.size < 1
  end

  Q< "takes the number of pins knocked down"
  def bonus_roll(pins)
    Q< "called only if the frame is waiting for a bonus roll"
    @bonus_rolls << pins
  end

  Q< "returns the total score for this frame"
  def score
    Q< "called only when no more rolls are required"
    @rolls.reduce(:+) + @bonus_rolls.reduce(0, :+)
  end

  private

  def _strike?
    @rolls.first == 10
  end

  def _spare?
    @rolls.reduce(:+) == 10
  end
end
- Note that more quacks were added as the implementation evolved; we haven’t shown their definitions here for brevity.
- Choices were made during implementation based on aesthetics. For example, it was important to me that Game#score be calculated by summing the score of each Frame. Not because that makes things generic and reusable, not because it makes things easier to test, and not because I was being “driven” by anything, but simply because it’s how I would explain the purpose of that code.
- Thinking about the implementation this way also revealed that the scoring mechanism of bowling in real life has an ugly hack; the player may bowl the ball up to three times in the tenth frame. Rather than implementing this hack in code (by having the Frame class be aware of whether or not it’s the tenth frame for instance), it became clear that each Frame required two different kinds of rolls; rolls that occur while the frame is the current frame, and bonus rolls that occur in the future, once the frame has been completed. Making this distinction explicit was also an aesthetic choice that served to increase the clarity of implementation.
Write some Examples
describe Game do
  let(:game) { Game.new }

  it "should return 0 when a gutter game is bowled" do
    20.times { game.roll(0) }
    expect(game.score).to eq(0)
  end

  it "should return 20 if all ones are bowled" do
    20.times { game.roll(1) }
    expect(game.score).to eq(20)
  end

  it "should take spares into account" do
    game.roll(5)
    game.roll(5)
    game.roll(3)
    17.times { game.roll(0) }
    expect(game.score).to eq(16)
  end

  it "should take strikes into account" do
    game.roll(10)
    game.roll(3)
    game.roll(4)
    16.times { game.roll(0) }
    expect(game.score).to eq(24)
  end

  it "should return 300 if a perfect game is bowled" do
    12.times { game.roll(10) }
    expect(game.score).to eq(300)
  end

  it "should work with the game given in the example" do
    [1,4,4,5,6,4,5,5,10,0,1,7,3,6,4,10,2,8,6].each do |pins|
      game.roll(pins)
    end
    expect(game.score).to eq(133)
  end

  it "should not work if the game is not valid" do
    expect(game.score).to quack("only called at the very...")
    expect(game.roll(11)).to quack("takes the number of...")
    expect(game.roll(5.5)).to quack("takes the number of...")
    expect(game.roll(nil)).to quack
    expect(30.times { game.roll(0) }).to quack
  end
end
- These are just standard rspec tests, but they also exercise the quacks.
- They can be written before, after, or together with the implementation. Whatever floats your boat. As long as they get written.
- The first five tests mirror those in the original Bowling Game Kata.
- We also test the example that was given in the Kata requirements, and finally that the quacks detect invalid usage.
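The expected totals in these examples can be double-checked with a small independent scorer. The sketch below is not part of Canard or the Kata; frame_scores is a hypothetical helper that assumes a complete, valid game and walks the flat list of rolls, which is a different approach from the Frame class above and so makes a reasonable cross-check.

```ruby
# Hypothetical cross-check: returns the ten frame scores for a complete game.
def frame_scores(rolls)
  scores = []
  i = 0
  10.times do
    if rolls[i] == 10                     # strike: 10 plus the next two rolls
      scores << 10 + rolls[i + 1] + rolls[i + 2]
      i += 1
    elsif rolls[i] + rolls[i + 1] == 10   # spare: 10 plus the next roll
      scores << 10 + rolls[i + 2]
      i += 2
    else                                  # open frame: just the two rolls
      scores << rolls[i] + rolls[i + 1]
      i += 2
    end
  end
  scores
end

example = [1, 4, 4, 5, 6, 4, 5, 5, 10, 0, 1, 7, 3, 6, 4, 10, 2, 8, 6]
frame_scores(example)     # => [5, 9, 15, 20, 11, 1, 16, 20, 20, 16]
frame_scores(example).sum # => 133
```

Note how the tenth frame needs no special case here either: its third roll is simply the bonus for the spare.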
Quack in Production
It is silly to throw away all this hard work once we ship. Wouldn’t it be better to continue to test our code live, as users interact with our software?
Yes, it would (rhetorical). Quacks can be configured to behave in different ways, which makes this an easier decision.
- Exception: Throw an exception. This is the default.
- Logger: Log an error but forge bravely onward.
- Debugger: Break into the debugger.
- Ignore: Do nothing at all.
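A minimal sketch of how such a switch might look, assuming a table of handlers keyed by mode. QuackPolicy, QuackFailed and report are hypothetical names rather than Canard's actual configuration API, and the debugger mode is left out because it is interactive.

```ruby
require "logger"
require "stringio"

# Hypothetical error type raised by the default mode.
class QuackFailed < StandardError; end

# Hypothetical policy table: one handler per failure mode.
module QuackPolicy
  HANDLERS = {
    exception: ->(message, _log) { raise QuackFailed, message },
    logger:    ->(message, log) { log.error("quack failed: #{message}") },
    ignore:    ->(_message, _log) {}
  }.freeze

  # Reports a failed quack according to the chosen mode (exception by default).
  def self.report(message, mode: :exception, log: Logger.new($stderr))
    HANDLERS.fetch(mode).call(message, log)
  end
end

# Log the failure, but forge bravely onward.
out = StringIO.new
QuackPolicy.report("only called at the very end of the game",
                   mode: :logger, log: Logger.new(out))
out.string.include?("quack failed") # => true
```

Keeping the policy separate from the quacks means the same validations can throw in development and merely log in production.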
Quacks are also easy to comment out in a scriptable, recoverable way, easily turning them into dead ducks, like this.
class Game
  #< "takes the number of pins knocked down"
  def roll(pins)
    #< "called each time the player rolls a ball"
  end

  #< "returns the total score for the game"
  def score
    #< "only called at the very end of the game"
  end
end
Notice that these “dead ducks” serve as comments.
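"Scriptable and recoverable" could be as simple as a pair of substitutions. The helpers below, kill_ducks and revive_ducks, are hypothetical and not part of Canard; this sketch only handles quacks that start a line, and it would also revive any ordinary comment that happens to begin with `#< `.

```ruby
# Hypothetical helpers: turn live quacks into comments and back again.
def kill_ducks(source)
  source.gsub(/^([ \t]*)Q< /, '\1#< ')
end

def revive_ducks(source)
  source.gsub(/^([ \t]*)#< /, '\1Q< ')
end

live = <<~RUBY
  class Game
    Q< "takes the number of pins knocked down"
    def roll(pins)
      Q< "called each time the player rolls a ball"
    end
  end
RUBY

dead = kill_ducks(live)
revive_ducks(dead) == live # => true
```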
Are the comments in your code dead ducks waiting to be reanimated through quack-driven development?
Summary
I don’t think TDD is dead. But I do think that writing tests to “drive” implementation is one of many tools that should be used sparingly.
- It doesn’t work when I need to prototype iteratively, because the requirements aren’t given up-front, and the design evolves as I go.
- If requirements and design exist, it’s better to begin with a skeleton implementation that expresses those things before writing tests.
A great alternative to TDD is QDD, which separates tests into validations that are written inline with implementation, and examples, which are for establishing an initial state and then testing that the implementation transforms this state correctly.
The Three Rules of QDD
- Write a skeleton implementation with quacks that capture requirements, interface contracts and programmer intents and beliefs.
- Write tests that exercise the quacks and cover the design, and write implementation to keep those tests green. In whichever order feels right. Write tests to increase your confidence. Write tests to unblock your creativity. Think hard, and write implementation when you have attained understanding.
- Run the quacks live in production, letting your customers provide the examples rather than your contrived specs!
Advantages of QDD
QDD has two major advantages over TDD: it keeps knowledge about requirements and programmer intent right in the implementation, and it continues to test your code in production.
As a bonus, the Q< emoticon is cute.
I recommend you add QDD to your waterfowl process today; it perfectly complements duck typing, duck punching and rubber ducking!
Q< “Testing shall never be ‘canard again!”