Abstraction in software engineering — Tests

Tiago Bevilaqua
Published in The Startup
Jun 10, 2020 · 12 min read


Evolution of Mondrian Paintings between 1908–1921. From Top Left — 1. The Red Tree (1908–1910) 2. The Grey Tree (1911) 3. Flowering Apple Tree (1912) 4. Composition in Blue-Grey-Pink (1913) 5. Composition with Gray, White and Brown (1918), 6. Composition with Large Red Plane, Yellow, Black, Gray and Blue, 1921.
Source: Wikiart.org, except 3. abcgallery.com & 4. paintingdb.com

Following my previous posts on abstraction in architecture design, Abstraction in software engineering — Architecture, and in applications, Abstraction in software engineering — Application, it's time to apply these concepts to testing. We'll look at why many testing approaches out there can be overwhelmingly redundant and usually waste a great deal of time. We'll also see how higher-level testing scenarios, written with Behaviour-Driven Development, can give us what we really need: making sure that our application works as it should, without significant overhead.

Wasting time when testing applications redundantly

Time is one of the scarcest resources in software engineering. So, I ask, who has time to implement all possible "nice to have" tests in every single application that's developed? I guess we agree on this one: no one! There are countless terms and ways to test applications nowadays. Here are some of them:

  • Unit Tests
  • Integration tests
  • Smoke tests
  • Regression tests
  • Interface tests
  • Component tests

The glimpse above is a reminder of how extensive the list of tests can become, and there's a lot more that I haven't highlighted here. Of course, you'd never implement all possible tests because there's some overlap amongst them. So, which ones should you use? Does it depend on your application or the programming language in use? Not really. Tests should be abstracted concepts, not application-specific ones. The main idea here is to come up with a generic yet efficient way to build tests that fits the specific application's needs and remains versatile and reusable for new integrations and implementations. Moreover, the intent is to maintain these tests as little as possible throughout an application's lifetime, so that we don't have to change them every time some internal logic changes in a way that has no bearing on the overall use case of the application. We'll explore more about this reusability and abstraction later in this post. For now, let's have a look at what you should be very clear about regarding tests.

Test coverage does not mean anything

Why does having coverage not mean anything? If my application is 100% covered, I don't have bugs, right? Wrong! What's the benefit of testing methods that don't have any logic? Even worse, what's the gain in validating features of the programming language in use? We might as well not use that language then. To elaborate on what I'm saying, let's have a look at the well-known Java POJO (Plain Old Java Object) below.

public class Person {

    private String name;
    private String surname;

    public Person(String name, String surname) {
        this.name = name;
        this.surname = surname;
    }

    public String getName() {
        return name;
    }

    public String getSurname() {
        return surname;
    }
}

By testing any of the methods above, be it the constructor, getName or getSurname, we're not testing any logic. Instead, we're just testing the JDK itself, which is of no use to us at all. In other words, we're wasting time and effort to achieve those so-called 100% coverage marks.
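To illustrate, a test like the hypothetical sketch below (assuming JUnit 4) only exercises the constructor and a getter; it proves that field assignment and a plain return work, which the language already guarantees:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PersonGetterTest {

    @Test
    public void getName_returnsName() {
        // Only proves that field assignment and "return name" work,
        // i.e. it exercises the language/JDK, not our own logic
        Person person = new Person("Jane", "Doe");
        assertEquals("Jane", person.getName());
    }
}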

Instead of doing the aforementioned, it would be a lot more intelligent to create a unit test that actually tests something. For example, consider the snippet below as an extension of the class above.

public String getFullName() {
    return String.format("%s %s", name, surname);
}

Testing this function is much better than the previous example because we're actually testing our logic, which is "name + surname". However, even though unit tests can be helpful, they can represent a hefty burden when changing and maintaining applications in the long term. Additionally, we can achieve the same benefit of unit tests by using BDD, which is what we'll talk about later on.
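For illustration, a minimal unit test for getFullName might look like the sketch below (again assuming JUnit 4); here we're asserting on our own formatting logic rather than on the JDK:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PersonTest {

    @Test
    public void getFullName_concatenatesNameAndSurname() {
        // The logic under test is the "name + surname" formatting
        Person person = new Person("Jane", "Doe");
        assertEquals("Jane Doe", person.getFullName());
    }
}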

Why unit tests can become a heavy burden

Unit tests and practices such as TDD (test-driven development) are undoubtedly useful during the development and maintenance of systems. They assist us with the introduction of new features and the refactoring of code. So, why am I highlighting that these types of low-level testing strategies can sometimes be a heavy burden? It's simple: these tests unavoidably have to change every time we change our logic, as they're 100% coupled to our code. So, in reality, we're duplicating or sometimes triplicating our codebase to test the successful and unsuccessful scenarios that specific functions may have, which ends up being a "catch-up game" between the tests and the code itself. Moreover, this repetition is found everywhere in the application, because every function that has logic in it is tested. For example, let's consider the function below and its candidate test scenarios.

doesSomething(parameter) {
    ..logic that does something
    returns something
}

# testingDoesSomething
@Test
when_doesSomethingReceivesCorrectValues_returnCorrectValues() {
    correctResult = doesSomething("CorrectValue")
    Assert.assertEquals(correctResult, "shouldBeCorrectValue")
}

# Expect an exception
@Test(expected = Exception.class)
when_doesSomethingReceivesIncorrectValues_throwsException() {
    doesSomething("IncorrectValue")
}

This is an extremely simplistic scenario where our function has only one input parameter and two possible outputs:

  1. It returns the correct value
  2. It throws an exception if the input is incorrect

Unfortunately, this isn't the reality of most functions we develop. Instead of having one input parameter, real-life functions might have several, and that's when things start to get complicated. Let's have a peek at how cumbersome functions and their tests can be in real life, paying attention to the plethora of repetition involved in developing the testing scenarios.

# Changing function
doesSomething(parameter1, parameter2, parameter3) {
    ..logic that does something
    returns something
}

Before, our function had one parameter, so it was easy to test with successful and unsuccessful scenarios, but now we have three parameters. What does that mean? In order to test our function thoroughly, we need to think of every possible scenario. In other words, we'll have eight different test scenarios to make sure that our function is tested properly (a parameterized test sketch follows the list). These scenarios would go like this:

doesSomething("correct", "correct", "correct") # OK 
doesSomething("correct", "correct", "incorrect") # Exception
doesSomething("correct", "incorrect", "correct") # Exception
doesSomething("correct", "incorrect", "incorrect") # Exception
doesSomething("incorrect", "correct", "correct") # Exception
doesSomething("incorrect", "correct", "incorrect") # Exception
doesSomething("incorrect", "incorrect", "correct") # Exception
doesSomething("incorrect", "incorrect", "incorrect") # Exception

It's clear that the more complex the function is, the more test scenarios we'll have if we want to make sure the function works correctly. If we stop for a second and think "how many functions do I have in my application?", this number would be scary, and if we're testing all of them properly, the number of tests will easily be around 5x the number of functions. So, is it really worth it to keep investing time and effort into unit tests? Let's have a look at the design of our application again to understand roughly how many unit tests we'd have to implement.

Abstraction in software engineering — Application

Given the above integrations within our application, we’d at the very least have to implement the below unit tests:

I don't mean to scare you, but these 36 unit tests are still a very simplistic and underestimated assumption. Unfortunately, these 36 tests are very likely to change whenever we update anything in our application. For example, if we change anything in our underlying code or any function contract, the tests won't even compile; therefore, they'll break, meaning that we'll have to adapt them to our code or the other way around (TDD). Regardless, we continuously have to change lines of code. What if we had a better, higher-level testing approach with which we wouldn't need to change these tests every single time we change a simple function in our application? We'll have a look at it next.

Abstracted tests — Behaviour-driven development

Firstly, what are abstracted tests? They're tests that utilise abstraction and decoupling principles and target the end-to-end flow of a specific high-level use case in a given application, such as making sure that we can save and take dogs (I hope you noticed that "save" and "take" are our interfaces/APIs). What's the benefit of implementing them over their lower-level counterparts, like unit tests? Regardless of how much the underlying implementation changes, our tests don't have to change; it's even possible to replace the whole language that the underlying application runs on, and the tests still wouldn't need to be altered.

Let’s quickly remember our API contracts defined in the previous post, Abstraction in software engineering — Application.

API Contracts

myapp.com.au/dog/save
Query params: name, breed, ownerEmail

myapp.com.au/dog/take
Query params: personsName, dogsName, dogParkAddress

Now, you can think of behaviour-driven development as a way to describe your scenarios in terms of their behaviour without having to understand their implementation (declarative vs imperative programming); even your Product Owner can write the tests if they want to. Imagine the following scenarios:

# Successful scenario
Given: I want to save my dog
When: I send my ownerEmail email@email.com, my dogsName Skip, and tell that his breed is Lab
Then: My dog will be saved successfully and I'll receive a 201 Created as a response
AND
I'll check the database to make sure that my dog has been saved

# Unsuccessful scenario
Given: I want to save my dog
# Sending wrong email
When: I send my ownerEmail wrongEmail.com, my dogsName Skip, and tell that his breed is Lab
Then: My dog will NOT be saved and I'll receive a 400 Bad Request as a response
AND
I'll check the database to make sure that my dog has NOT been saved

Note that in our scenarios we added variables that were interpreted by our testing engine and used to trigger HTTP requests, connect to databases, and run queries to double-check the end-to-end result of our flow. To illustrate, the above definitions would trigger the following executions:

# Successful
1 - Create HTTP POST request
    myapp.com.au/dog/save?name=Skip&breed=Lab&ownerEmail=email@email.com
2 - Check the database to make sure the dog HAS been saved

# Unsuccessful
1 - Create HTTP POST request
    myapp.com.au/dog/save?name=Skip&breed=Lab&ownerEmail=wrongEmail.com
2 - Check the database to make sure the dog has NOT been saved
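To make the wiring concrete, here is a minimal sketch of the step definitions such a testing engine might use, written in Java with Cucumber. The base URL, JDBC URL, table name, and exact step wording are assumptions for illustration, not the actual test engine behind the scenarios above:

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import io.cucumber.java.en.Given;
import io.cucumber.java.en.Then;
import io.cucumber.java.en.When;

public class SaveDogSteps {

    private static final String BASE_URL = "https://myapp.com.au";           // assumed
    private static final String DB_URL = "jdbc:postgresql://localhost/dogs"; // assumed

    private final HttpClient client = HttpClient.newHttpClient();
    private HttpResponse<String> response;
    private String dogName;

    @Given("I want to save my dog")
    public void iWantToSaveMyDog() {
        // Intent only; the actual request is sent in the When step
    }

    @When("I send my ownerEmail {word}, my dogsName {word}, and tell that his breed is {word}")
    public void iSendMyDog(String ownerEmail, String dogsName, String breed) throws Exception {
        dogName = dogsName;
        String url = String.format("%s/dog/save?name=%s&breed=%s&ownerEmail=%s",
                BASE_URL, dogsName, breed, ownerEmail);
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        response = client.send(request, HttpResponse.BodyHandlers.ofString());
    }

    @Then("My dog will be saved successfully and I'll receive a 201 Created as a response")
    public void dogSaved() {
        assertEquals(201, response.statusCode());
    }

    @Then("My dog will NOT be saved and I'll receive a 400 Bad Request as a response")
    public void dogNotSaved() {
        assertEquals(400, response.statusCode());
    }

    @Then("I'll check the database to make sure that my dog has been saved")
    public void dogIsInDatabase() throws Exception {
        assertTrue(dogExists(dogName));
    }

    @Then("I'll check the database to make sure that my dog has NOT been saved")
    public void dogIsNotInDatabase() throws Exception {
        assertFalse(dogExists(dogName));
    }

    private boolean dogExists(String name) throws Exception {
        // Hypothetical "dog" table; only presence of the row matters to the scenario
        try (Connection conn = DriverManager.getConnection(DB_URL);
             PreparedStatement stmt = conn.prepareStatement("SELECT 1 FROM dog WHERE name = ?")) {
            stmt.setString(1, name);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next();
            }
        }
    }
}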

We always need to remember that what's essential to our customers and users is whether they can save their dogs or take them to the dog park; they don't care about the underlying implementation of our system, so why should we have this kind of concern when writing tests?

Let's take a step back and understand how we abstracted the whole underlying engine in our test use cases above.

  1. We know that HTTP endpoints are the first entry point to our system, which is where our input comes from.
  2. We know our application has complex logic and that this trend is set to continue, but we don't care about it because we're writing high-level testing scenarios.
  3. We know that we have third-party integrations that our application depends on, but again, we don't care about them because we're writing high-level testing scenarios; we'll mock them all (more on this in the next section).
  4. We know that the final state in the journey of a request received on the endpoints is that its values are saved to the database atomically. Therefore, either the value is there (successful request) or not (unsuccessful request).

Given that we went through the first and the last hop of our application and understand where the state of a successful or unsuccessful request is saved, we can clearly abstract the rest, as the following picture illustrates.

Okay, we've done it, but what do we gain when implementing abstracted tests?

  1. We don't worry about the underlying application implementation.
  2. Tests don't need to change when the underlying implementation does.
  3. We decouple our tests from our code (tests could be written in a different programming language).
  4. We test behaviour, not specific functions.
  5. Our tests have meaningful value to developers, POs and Stakeholders.
  6. We can build different tests on top of these abstracted tests quickly, such as performance tests, by calling the API as many times as required in a short period of time (see the sketch after this list).
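As a rough illustration of point 6, the very same HTTP call that drives the behaviour scenarios could be reused as a naive performance probe. The endpoint, iteration count, and expected status below are assumptions for the sketch:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SaveDogPerformanceProbe {

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // The same request the behaviour test issues, fired repeatedly
        HttpRequest request = HttpRequest.newBuilder(URI.create(
                "https://myapp.com.au/dog/save?name=Skip&breed=Lab&ownerEmail=email@email.com"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();

        int iterations = 100;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() != 201) {
                throw new IllegalStateException("Unexpected status: " + response.statusCode());
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("%d requests in %d ms (avg %.1f ms/request)%n",
                iterations, elapsedMs, (double) elapsedMs / iterations);
    }
}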

I guess it's now clear that even if the underlying implementation changes, the behaviour stays the same, which means that we don't need to change our high-level tests. Additionally, we can reduce our test scenarios significantly as well as add meaningful insight to them. Even though it sounds superb, let's try to deceive our newly implemented tests with the following scenario:

  1. Our developer is told to improve the performance of the save dog function because it’s slow.
  2. They make the validation a lot faster; however, they make a mistake when upgrading it and remove the email validation that requires emails to contain the symbol "@".
  3. They're about to run the tests. Would the tests catch the mistake?

Let's see what would happen in real life. Here's the test scenario:

# Unsuccessful scenario
Given: I want to save my dog
# Sending wrong email
When: I send my ownerEmail wrongEmail.com, my dogsName Skip, and tell that his breed is Lab
Then: My dog will NOT be saved and I'll receive a 400 Bad Request as a response
AND
I'll check the database to make sure that my dog has NOT been saved

Given that the email validation for the "@" symbol isn't in our system anymore, the HTTP response wouldn't be 400 Bad Request; instead, it would be 201 Created. Additionally, the value would have been saved to the database, so again, a double fail.

Now I ask, did our developer have to write unit tests to catch this error? Were they covered against their own unintentional mistake? That's the major gain we get when we abstract low-level details from our perspective and invest our time in what really matters: the use cases.

There's still a problem with this approach. Even though everything seems to be working perfectly, real-world applications have various integrations with other applications, which makes them incapable of functioning by themselves. So, how can we run tests on systems that depend on others? Let's talk about it.

Third-party dependencies when running tests

What do we do when testing an application that relies on various other integrations, such as our current EmailValidator and DogParkValidator dependencies? Sometimes these dependencies might belong to different teams or vendors, and they might take hours, if not days, to respond to requests if their systems go down. So, what do we do? Do we wait for them to fix the problem and halt our development until then? That doesn't sound smart or agile. We mock it up!

Firstly, we need a clear, decoupled view of what we're testing, as below:

Secondly, we need to understand what we can control (in green) and what we can't control (in red) during our tests, so that instead of calling the real third-party systems we can't control, we mock their responses.

Thirdly, we need to isolate the dependency on these systems by knowing the expected results they provide. In our scenario, we were smart and segregated all third-party calls into a single layer of our application, the Repository layer. So, we need to mock the API calls "is email valid" and "does dog park exist", which will be reflected in our test case scenarios. Say, when the email is "wrongEmail.com", the EmailValidator returns false. Similarly, when the dog park address is "123 wrong address", the DogParkValidator also returns false; otherwise, they return true. Given that we only need to understand their responses for given inputs so that our application can react to them, we just need to test a positive and a negative scenario against these third-party APIs. Remember, we don't want, and it isn't in our interest, to understand the specifics of these third-party APIs, only what goes in and what comes out.

Once these three steps are in place, it becomes trivial to mock these systems' responses, as below:

# Mocking EmailValidator definition scenario
Given: The EmailValidator system is called
When: The email received is wrongEmail.com
Then: we'll receive a 400 Bad Request as a response
AND
the body message will be false

# Mocking DogParkValidator definition scenario
Given: The DogParkValidator system is called
When: The address received is 123 wrong address
Then: we'll receive a 400 Bad Request as a response
AND
the body message will be false
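How these mocked responses are wired up depends on the test engine. One common option is an HTTP stub server such as WireMock; the sketch below assumes the validators are called over HTTP GET with a query parameter, and the paths and port are purely illustrative, not the real third-party contracts:

import static com.github.tomakehurst.wiremock.client.WireMock.aResponse;
import static com.github.tomakehurst.wiremock.client.WireMock.equalTo;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.urlPathEqualTo;

import com.github.tomakehurst.wiremock.WireMockServer;

public class ThirdPartyStubs {

    public static WireMockServer startStubs() {
        // The application under test is pointed at this port instead of the real validators
        WireMockServer server = new WireMockServer(8089);
        server.start();

        // EmailValidator: wrongEmail.com -> 400 Bad Request with body "false"
        server.stubFor(get(urlPathEqualTo("/email/validate"))
                .withQueryParam("email", equalTo("wrongEmail.com"))
                .willReturn(aResponse().withStatus(400).withBody("false")));

        // DogParkValidator: "123 wrong address" -> 400 Bad Request with body "false"
        server.stubFor(get(urlPathEqualTo("/dogpark/validate"))
                .withQueryParam("address", equalTo("123 wrong address"))
                .willReturn(aResponse().withStatus(400).withBody("false")));

        // Any other input is considered valid: 200 OK with body "true"
        // (lower priority so the specific stubs above win)
        server.stubFor(get(urlPathEqualTo("/email/validate")).atPriority(10)
                .willReturn(aResponse().withStatus(200).withBody("true")));
        server.stubFor(get(urlPathEqualTo("/dogpark/validate")).atPriority(10)
                .willReturn(aResponse().withStatus(200).withBody("true")));

        return server;
    }
}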

Done! We've just isolated our dependencies, and instead of calling them, we're mocking their responses based on our use cases. The possibilities here are unlimited; it only depends on how much time you have to mock and develop these scenarios. However, the gain is enormous: we don't rely on any external system to test our own, and we can move faster and uninterrupted with our deliveries.

Moral of the story

Nobody has an unlimited amount of time and money to implement every single possible testing approach that exists, and what looks good in theory might not fit real-life use-case scenarios. Think about the way tests are implemented and how to always make the most of them with the minimum refactoring required. Always think about reusability: "Can I build other tests on top of my current tests, say performance tests?" If the answer is yes, with low maintenance and high reusability, you're doing a great job.

Next steps

Review and apply these concepts to your real-life applications, try to figure out which test engines can be reused across your systems, and get rid of the ones that can't. Develop and test your applications in a way that doesn't require refactoring at every single code change.
