Beginner’s guide to automated testing

A few years ago, I was lucky enough to cross paths with a group of people who were very excited about putting the Agile mindset into action. I didn’t know the first thing about testing, and I felt bad about that. Knowing how little I knew and practiced testing made me feel like a very incomplete IT professional.

Ever since, I’ve learned a few things along the way and developed my own small set of ideas about how to write automated tests and how to make the best use of them. I’ve come to understand that tests sometimes work not only at the machine level, but also at the psychological level, making programmers feel safer doing their jobs.

So, I’d like to use this post to share some ideas of my own about automated tests. I don’t mean to say I’m a testing master at all. Many people have far more knowledge than me on this subject, I am probably wrong about many (if not all) of my ideas, most of them are probably quite simple and obvious, and I fully recognise there is still a long way ahead of me.

Also, automated testing is a really broad subject. There are whole conferences where people from all around the world gather to talk about different aspects of the discipline: distributed testing, automated acceptance testing, mobile testing, and so on. I have no intention of covering everything here. I’ll mostly focus on the smallest and most basic kind of testing: automated unit testing. Most of the concepts presented in the next paragraphs can be adapted and expanded to other situations, scenarios and kinds of testing, but I don’t want to bore you with my extensive lack of knowledge, at least not too much.

Nonetheless I’d like to share what I’ve learned so far with you, dear reader. I hope you enjoy it. If you happen to have any complaints, questions, comments or suggestions, please feel free to add comments to this post.

So, the very first thing is:

Automated tests must have control over their dependencies

The whole idea of a test of any kind (automated, manual, handwritten on a piece of paper) is to verify whether the outcome of a piece of code matches your expectations.

At the unit level, to achieve that, your test must have control over all variables, to make sure your code does what it is supposed to do under the foreseen circumstances. If you have no control over what surrounds your code during a test, then your code becomes unpredictable and your test becomes worthless.

So, generally speaking, if you’re testing a method that depends on, or delegates execution to, another class or component, you should make sure this other component behaves as expected. There are several ways of doing that. The most usual is to create a mock object that fakes the dependency and reacts in whatever way you define.
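As a minimal sketch of the idea, here is a hand-rolled fake (the PriceService interface and Checkout class are hypothetical names I made up for illustration; in real code you might use a mocking library instead):

```java
// Hypothetical dependency: in real code this might call a remote service.
interface PriceService {
    double priceOf(String productId);
}

// Class under test: it delegates part of its work to PriceService.
class Checkout {
    private final PriceService prices;

    Checkout(PriceService prices) {
        this.prices = prices;
    }

    double totalFor(String productId, int quantity) {
        return prices.priceOf(productId) * quantity;
    }
}

class CheckoutTestSketch {
    public static void main(String[] args) {
        // A hand-rolled fake: the test fully controls what the dependency returns.
        PriceService fake = productId -> 10.0;

        Checkout checkout = new Checkout(fake);

        // The outcome is predictable because the dependency is under our control.
        double total = checkout.totalFor("book", 3);
        if (total != 30.0) {
            throw new AssertionError("expected 30.0 but got " + total);
        }
        System.out.println("total = " + total);
    }
}
```

Because the fake always answers 10.0, the test knows exactly what total to expect; no network, no real pricing logic, no surprises.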

On the other hand, faking a dependency might be extremely expensive, or might not make any sense at all. For instance, suppose you have some persistence-layer code you want to test, such as a DAO. Creating support code to fake the database can be quite error-prone and expensive. Besides, if the sole responsibility of that piece of code is to interact with the database, it wouldn’t make much sense not to hit the database and make sure your query works as expected. In this case, the concept is the same: you must have control over external dependencies. The difference is that now you have a big fat dependency: a database.

How do you keep control of such a thing? By making sure you always know the state of the database before and after your code runs. That means creating all tables from scratch before the test runs and dropping everything after the execution finishes. Or, at least, making sure all tables are empty before and after your test runs.
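The shape of that discipline looks roughly like this (a sketch only: here a simple in-memory map stands in for the real database so the example is self-contained, and FakeDatabase, CustomerDao and the setup/teardown names are all hypothetical; in a real test the setup and teardown would issue DDL through JDBC):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the real database, so this sketch stays self-contained.
class FakeDatabase {
    final Map<Long, String> customers = new HashMap<>();
}

// Hypothetical DAO whose sole job is to talk to the "database".
class CustomerDao {
    private final FakeDatabase db;

    CustomerDao(FakeDatabase db) { this.db = db; }

    void insert(long id, String name) { db.customers.put(id, name); }

    String findName(long id) { return db.customers.get(id); }
}

class CustomerDaoTestSketch {
    static FakeDatabase db;
    static CustomerDao dao;

    // In JUnit this would be a @Before method: build everything from scratch
    // so the test always knows the exact state of the database beforehand.
    static void setup() {
        db = new FakeDatabase();       // "create all tables"
        dao = new CustomerDao(db);
    }

    // In JUnit this would be an @After method: leave nothing behind.
    static void teardown() {
        db.customers.clear();          // "drop everything"
    }

    public static void main(String[] args) {
        setup();
        dao.insert(1L, "Alice");
        if (!"Alice".equals(dao.findName(1L))) {
            throw new AssertionError("unexpected DAO result");
        }
        teardown();
    }
}
```

The point is the shape, not the storage: state is built from a known baseline before each test and wiped afterwards, so no other test or user can change what this test observes.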

But why is it so important to have control over external dependencies?

Consider you have tests running against a given database. Let’s call it “Dev DB”. This DB keeps changing: tables and rows are added and deleted by multiple users or processes. You’ll end up in at least one of these situations:

  • Your tests won’t ever be able to assess the DB state before execution, preventing them from knowing what to expect afterwards, resulting in tests that can’t possibly test anything;
  • Your tests constantly break, since the pre-conditions and post-conditions expected by them keep changing;
  • To avoid breaking all the time, your tests become so vague that even when you introduce a serious disruption or a major bug, no tests break and everything remains green;
  • Since your tests break all the time anyway, no one really cares when your CI server goes red: it could mean either that you introduced a new bug or that new rows were added to the DB, changing the result. Recent history says the second is more likely, so you just sit and wait for someone to fix that line of test code.

So, in the end, having no control over external dependencies makes the team lose confidence in the tests and, with time, they become obsolete and useless.

Automated tests should test one thing at a time

A good pattern to follow when writing tests is the so-called “Build-Operate-Check”. In BDD terms, it translates to the “Given-When-Then” triad. That means a test should ideally have three sections:

  1. Prepare everything your test needs to run. Set the ground and the pre-conditions that your test and your production code need for the scenario you’ll be testing;
  2. Run the target method;
  3. Check whether the post-conditions have been met; in other words, verify that the outcome of your production code meets your expectations and that the expected post-conditions have taken place.

On top of those three sections, a test should also exercise only one scenario and one method at a time. But why only one method at a time?

Consider you have a test like this:

public class MyTests {

    private MyTargetClass targetClass;

    @Before
    public void setup() {
        targetClass = new MyTargetClass();
    }

    // … other tests omitted

    @Test
    public void testMultipleMethods() {
        // given
        targetClass.createPreconditions();

        // when
        String outcome = targetClass.targetMethod();

        // then
        assertEquals("outcome", outcome);
    }
}

Now, what happens if a bug is introduced into createPreconditions? You’ll end up with at least two failing tests: the one that checks whether createPreconditions works as expected, and the one above. So, even though targetMethod still works as expected, a bug introduced into a production method that should have nothing to do with the method being tested here still breaks the test. You end up creating a dependency between your tests, which is not at all desirable.

And why one scenario per test?

Let’s use another example. Typically, your production code may have to deal with several scenarios, such as:

  • the input is null;
  • the input is not null but also not valid;
  • the input is valid and generates outcome A;
  • the input is valid and generates outcome B;

Now, what happens if you test multiple scenarios within the same test method? The very same thing: a bug that just affects the first scenario may start breaking tests that aim at verifying something completely different, and you create a dependency between your tests.
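To make that concrete, here is a sketch of one small test per scenario (InputClassifier and its return values are hypothetical names invented for this example; in real code these would be JUnit @Test methods rather than a main):

```java
// Hypothetical production class with several scenarios to cover.
class InputClassifier {
    String classify(String input) {
        if (input == null) return "NULL_INPUT";
        if (input.isBlank()) return "INVALID";
        return input.startsWith("A") ? "OUTCOME_A" : "OUTCOME_B";
    }
}

// One small test per scenario: a bug in one scenario breaks exactly one test.
class InputClassifierTestSketch {
    static InputClassifier target = new InputClassifier();

    static void testNullInput() {
        check("NULL_INPUT", target.classify(null));
    }

    static void testInvalidInput() {
        check("INVALID", target.classify("   "));
    }

    static void testValidInputGeneratesOutcomeA() {
        check("OUTCOME_A", target.classify("Apple"));
    }

    static void testValidInputGeneratesOutcomeB() {
        check("OUTCOME_B", target.classify("Banana"));
    }

    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        testNullInput();
        testInvalidInput();
        testValidInputGeneratesOutcomeA();
        testValidInputGeneratesOutcomeB();
        System.out.println("all scenarios pass");
    }
}
```

If someone breaks only the null-handling branch, only testNullInput fails, and the failure name alone tells you where to look.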

This goes along with some of the things Uncle Bob says in Clean Code:

  • there should be only one concept for test;
  • tests should be independent;
  • tests should be self-validating;

And why do we want to be so specific about what we test? Well, is there anything more frustrating than not knowing why your test is broken?

We want tests to be small, independent and self-contained so that, whenever one breaks, we can quickly and safely spot what went wrong and get it fixed. If you have to spend hours trying to figure out why a test doesn’t work, people will start disregarding it and not fixing it, which, in turn, will lead to a useless test suite.

Be specific about the outcome

Over the years, I have seen many tests written like this:

public class MyTests {

    private MyTargetClass targetClass;

    @Before
    public void setup() {
        targetClass = new MyTargetClass();
    }

    @Test
    public void testOutcomeSize() {
        // when
        List outcome = targetClass.targetMethod();

        // then
        assertEquals(2, outcome.size());
    }

    @Test
    public void testOutcomeIsNotNull() {
        // when
        String outcome = targetClass.otherTargetMethod();

        // then
        assertNotNull(outcome);
    }
}

Can you spot the problem within these two tests? At first, they seem to be OK, right?

But are they really checking whether their target methods work as expected? Or are they just testing whether the target methods produce any outcome at all? Take a look at this improved version:

public class MyTests {

    private MyTargetClass targetClass;

    @Before
    public void setup() {
        targetClass = new MyTargetClass();
    }

    @Test
    public void testOutcomeSize() {
        // when
        List outcome = targetClass.targetMethod();

        // then
        assertEquals(2, outcome.size());
        assertEquals("first", outcome.get(0));
        assertEquals("second", outcome.get(1));
    }

    @Test
    public void testOutcomeIsNotNull() {
        // when
        String outcome = targetClass.otherTargetMethod();

        // then
        assertNotNull(outcome);
        assertEquals("expected outcome", outcome);
    }
}

Can you see what a huge difference the new lines make?

The first version checks whether an outcome is generated, whereas the second one checks whether the right outcome is generated.

The first version would consider any two-element list to be valid, regardless of its contents or their order. But what we really want is to make sure that targetMethod returns an ordered list, where first is the first element and second comes last. And that’s a huge difference from a functional perspective.

Imagine your customer logging into your system to retrieve the invoices from the last two months in order to pay you, and instead getting the last two songs you listened to on your iPod. That’s money going down the drain!

The same thing applies to the second test. Imagine your user tries to view information about his most important customer and instead gets information about a customer of his competitor. That could cause serious issues for both of your customers, not to mention their confidence in your product.

The same principle applies to checking exceptions. What is wrong with the test below?

public class MyTests {

    private MyTargetClass targetClass;

    @Before
    public void setup() {
        targetClass = new MyTargetClass();
    }

    @Test(expected = Throwable.class)
    public void testOutcomeSize() {
        // given
        targetClass.setPrecondition(GENERATE_BUSINESS_EXCEPTION);

        // when
        targetClass.targetMethod();
    }
}

At first, it seems alright, no? But can you really be sure that your production code is failing for the reason you want it to fail? Can you differentiate a NullPointerException from the business-specific exception your code is supposed to raise? What about improving the test a little to look more like this?

public class MyTests {

    private MyTargetClass targetClass;

    @Before
    public void setup() {
        targetClass = new MyTargetClass();
    }

    @Test(expected = MyBusinessException.class)
    public void testOutcomeSize() {
        // given
        targetClass.setPrecondition(GENERATE_BUSINESS_EXCEPTION);

        // when
        targetClass.targetMethod();
    }
}

Looks better, right? Now I can be completely sure that my code throws the exception it’s supposed to throw in that specific scenario. And this way, if someone changes the code so it no longer throws that exception, I will have a test telling that person to think twice about how much sense that makes. I can also be sure that, if a NullPointerException is thrown, my test will not silently accept it and will break as well.

For the same reason, don’t write tests that interact with more than one target class. If something breaks in class A, you want to be precise about the impact of that change; you don’t want to fix a million tests that were checking something completely different and whose context had nothing to do with class A. Always prefer being minimalist about the scope of your tests: test one class and one method at a time, and use different test classes for different target classes.

Some people will even go further and state that tests should have a single assertion. Although I respect this point of view, I don’t really like it, and the reason is that a piece of production code may have several outcomes. For instance, a given method may be responsible for storing a change in the database and invoking a notification component to send an e-mail. One action fires two consequences.

In my eyes, tests are also about context. So, if in the context of a given call to a target method two consequences have taken place, I’d rather have both properly checked under the same test and context, instead of having two almost identical tests (with the same preconditions and the same action being fired) that differ only in the check for the expected outcome. But this is really just a matter of personal preference.
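A sketch of the save-and-notify example above, with both consequences checked in one test context (FakeRepository, FakeNotifier and ChangeService are hypothetical names; in real code the fakes might come from a mocking library):

```java
import java.util.ArrayList;
import java.util.List;

// Hand-rolled fakes standing in for the real repository and notifier.
class FakeRepository {
    final List<String> saved = new ArrayList<>();
    void save(String change) { saved.add(change); }
}

class FakeNotifier {
    final List<String> emails = new ArrayList<>();
    void sendEmail(String message) { emails.add(message); }
}

// Hypothetical production class: one action, two consequences.
class ChangeService {
    private final FakeRepository repository;
    private final FakeNotifier notifier;

    ChangeService(FakeRepository repository, FakeNotifier notifier) {
        this.repository = repository;
        this.notifier = notifier;
    }

    void applyChange(String change) {
        repository.save(change);                  // consequence 1: persist
        notifier.sendEmail("Changed: " + change); // consequence 2: notify
    }
}

class ChangeServiceTestSketch {
    public static void main(String[] args) {
        // given
        FakeRepository repository = new FakeRepository();
        FakeNotifier notifier = new FakeNotifier();
        ChangeService service = new ChangeService(repository, notifier);

        // when: a single action is fired
        service.applyChange("new address");

        // then: both consequences are checked in the same context
        if (!repository.saved.contains("new address")) {
            throw new AssertionError("change was not persisted");
        }
        if (!notifier.emails.contains("Changed: new address")) {
            throw new AssertionError("notification was not sent");
        }
        System.out.println("both consequences verified");
    }
}
```

One given, one when, two thens: the alternative would be two tests with identical setup and action, differing only in the final assertion.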

Tests should be fast

An often neglected aspect of automated tests is that they also work on the psychological side of IT. Fast feedback is a powerful tool for stimulating a team to use tests. Whenever someone changes anything, that person can quickly know the side effects of the change. If a developer had to wait half an hour for that, he or she would most likely just push the change before the tests finish, letting the rest of the team down and leaving them to deal with a breakage they neither caused nor asked for.

Fast feedback is also very powerful when you have a large collection of broken tests: when people can quickly see the positive outcome of the effort they spent fixing a couple of tests, it makes them feel good about it. When that large broken test suite is finally fixed, it’s a victory for the team, as they now have a very reliable way of spotting impacts they didn’t anticipate in the rest of the codebase.

That’s one more reason to minimize integration and functional tests. Those kinds of tests add great value, but they can be really slow, especially when there are many of them, and therefore they should be kept to a bare minimum.

If you cannot give up a large suite of integration and functional tests, what you can do is make unit, integration and functional tests run at separate moments.

In a scenario where it takes more than ten minutes to run your entire test suite and then push some code changes, odds are your team will give up on waiting and will push untested code all the time, letting Jenkins do the hard work and, hence, breaking the code base completely.

In this case, what you can do is have separate commands for running each type of test and agree with your team on the bare minimum that needs to run before pushing something. In teams I’ve worked on, the general agreement was that running unit tests before pushing was mandatory, since we had few integration and functional tests and they happened to be quite stable. But this should really be agreed upon on a case-by-case basis, depending on what the team deals with daily.

There’s so much more out there!

I guess that’s all I had to share. As I said, there’s a lot I don’t know yet, and I have no intention of pretending otherwise. But you don’t really need to worry about applying all (or any) of this: if you just use TDD, you get all of these topics covered for free. But TDD is a completely different topic, one which I’m very far from mastering.

What about you? What other test related good practices do you have? What has worked or not for your team under a particular circumstance? Feel free to add comments and enrich the discussion.