There is a lot of talk around testing — who will do it, when it needs to happen, boxes it needs to fit in — yet not enough on the actual testing. This is the first article in the series of looking at software to test, and figuring out how to test it. This article is based on the experiences I’ve had watching people test as I coach and teach them using these examples.

Introducing Gilded Rose

Gilded Rose is a refactoring Kata (practice) created by Emily Bache. The idea with Gilded Rose is simple. There’s an inn somewhere that has an inventory system implemented. They would want it extended for new requirements but that won’t be all straightforward. Don’t break anything that used to work!

Priming With Information Sources and Tools

At work, things don’t come to you with the full range of sources and tools readily handed in. For doing the exercise, I no longer drop people in cold to “just figure it out” but I give them a few starting points.

  • Code. It “works in production” now and you can look at the implementation. If Java isn’t your cup of tea even if I use it while I teach this, Emily has been nice to provide it with tons of other languages.
  • Unit test. The one unit test gives a starting point of how to execute the code so that you don’t need to figure that out.
  • IDE with visual code coverage. Code and unit test in an environment where you can start working on it, with a code coverage tool. I use Eclipse with Code Coverage as I just like the visual nature of it showing green, yellow and red for the branches.
  • Ideas of the domain. You have past experiences of what makes sense for a shop and inventory management system. You have ideas of what inputs are meaningful, and what could be confusing. Everything you’ve learned about what makes software worthwhile is with you.

Getting to Work — How Would I Test this?

We’re approaching the exercise with exploratory testing, and all of our options are open. What makes this exercise particularly exploratory is that I will rule out the option of going away to your cubicle to write test cases based on the specification without running a single test. I expect you to design your tests as you go and allow you to learn rather sooner than later.

Just Try It

You could just forget about the specification for now, as well as not read the code that implements this, and start playing with values you can enter into the CheckItem-method. It takes three inputs:

  • a number of days we sell the item (Integer selling)
  • a number indicating how valuable it is (Integer quality)

Read the Specification

Exploratory testing does not mean you have to jump in without considering any of the sources. It means you are intentional about what you choose, and you combine things in ways that keep you engaged as well as ensure you do a good job tracking coverage and meaningfullness of your work. Reading the specification gives you one way tracking coverage.

Read the Code

You could also choose to read the code. You could choose to introduce some tests that enable you to step the logic through in a debugger so that maybe you could see some patterns in how it is implemented. Maybe you just read it without executing so much. Read line by line, or read paying attention to some aspects like variable names or values the code checks against.

Think about the environment

The program you’re supposed to test is probably intended for some use by some people somewhere. That somewhere most likely isn’t the test machine you’re using now, and the end user interface most likely isn’t going to be the method you have your hands on now.

The First Test

There is no absolute choice for the first test, and having tested this with a good crowd of people both individually and in mobs, some people still make a different choice for the first test in the setup we test in.

The Second Test

With the second test, we arrive at the significant divergence of choices.

  • the specification
  • the code through measurable coverage
  • the context of use

Reading the Spec While Aiming for 100% Branch Coverage

Let’s assume we intentionally, not accidentally, chose to approach this code and code coverage first, with the help of the specification.

  • The shorthand of naming items in the specification in comparison to full names used in the code
  • The fuzziness around limits the rules defined that behavior would change at

Tools Ease The Exploration

To contract to the 18 handpicked tests over 2 hour intensive work, I’ll show you a few minute example of just covering the code with ApprovalTests-library.

Summing up to some fun pitfalls

For a little exercise like this, it has surprising dimensions. To end this article, I wanted to share lessons learned with one newbie tester who found out there was a lot to learn.

  • Any coverage will do, some here some there and testing is done! That is true, but tracking how much done are you is a big part of testing. Even the code coverage can fool you because the number shows line coverage and the colors show branch coverage. Calling it done on line coverage left out lines leading to new branches.
  • Naming coverage dimensions isn’t a thing everyone does, they didn’t. Conceptualizing code, specification, environment and risk coverage is a necessary thinking model.
  • Code coverage can be achieved without proper asserts and gives a very false sense of security. You don’t even need to assert values of quality for the 16 tests above to have them still run to 100% branch coverage.
  • Problems against specification were all assumed distinct problems, no grouping around the fact that there was full areas of “I expect input validation” that were not implemented anywhere.
  • Specification having shorthand like saying “Sulfuras” instead of “Sulfuras, the Hand of Ragnarok” isn’t a major problem with the specification even if it bit them in the exercise. Best of specifications are helpful, yet incomplete. Best of testers using specifications don’t need them to be complete to do complete testing.
  • When results did not match the specification, quickly jumping to conclusion that we are testing a buggy software which wasn’t the case. The models to pinpoint whether problem could be in my tests that I have control over were not in place. We practiced many rounds of “is there another test we could do that would give us a second data point to verify we understood the requirement”.
  • One test one requirement isn’t a thing. They are a messy network of dependencies.

I'm here to figure out what contents people would like on my book on LeanPub:

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store