Avoid bad bugs to keep catching good bugs!
Characteristics of a sustainable test automation garden
A garden attracts bugs, especially a garden of test automation! Just like in a real garden, a test automation garden can attract good and bad bugs:
- Good bugs in the software
- Bad bugs in the test
👆 What is this ‘garden of test automation’? Find out in this blogpost, and more: how my perspective grew to an ‘ecosystem of test automation’, a context-driven approach that can help to ensure sustainable test automation.
Both types of bugs need to be minimized, but especially the bad ones. If a bug in the software is caught, the purpose of the test set is fulfilled: it adds value! A bad bug confirms the exact opposite. With every bug found in the test, the confidence in and the reliability of the test set decrease. Over time this leads to devaluation of the test set, to the point where it no longer adds any value at all.
This article covers three characteristics of healthy automated tests: atomic, independent and descriptive. Concepts and guidelines, but more importantly how to apply those in practice to ensure a sustainable test automation garden. One that avoids bad bugs, catches good bugs in the software and keeps adding value with test automation!
Atomic is the first characteristic of a healthy garden. Atomic means focusing on one goal, which unlocks a lot of advantages. If an atomic test fails, you know exactly why. One goal also forces you to keep a test small and fast. Striving for atomic tests is not a new concept. However, the art is to apply this in practice.
When googling atomic tests, you will find various guidelines for specific types of tests. I believe atomic tests can be applied to all layers of test automation. The context and characteristics of a layer determine the most suitable goal of the test. Keep it simple and focus on what the test is intended for:
Validating the smallest testable unit is what a unit test is about. So the focus should be on specific behavior and paths within this unit. Boundary values help to optimize coverage of conditions in the code.
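To make the boundary value idea concrete, here is a minimal sketch in Python. The function `requires_pilot` and its 300 m threshold are assumptions made up for this example, not part of any real application. Each test has one goal: a single value just below, exactly at, or just above the boundary of the condition.

```python
# Hypothetical unit under test; the rule and the 300 m threshold are
# assumptions made up for this example.
PILOT_THRESHOLD_M = 300.0


def requires_pilot(ship_length_m: float) -> bool:
    """Return True when a ship is long enough to need a pilot."""
    if ship_length_m <= 0:
        raise ValueError("ship length must be positive")
    return ship_length_m >= PILOT_THRESHOLD_M


# One goal per test: each test pins down a single boundary of the condition.
def test_no_pilot_required_just_below_threshold():
    assert requires_pilot(299.9) is False


def test_pilot_required_exactly_at_threshold():
    assert requires_pilot(300.0) is True


def test_pilot_required_just_above_threshold():
    assert requires_pilot(300.1) is True


def test_length_is_rejected_when_not_positive():
    try:
        requires_pilot(0)
    except ValueError:
        return  # expected: invalid input is refused
    raise AssertionError("expected ValueError for a non-positive length")
```

If the comparison in the unit ever changes from `>=` to `>`, only the test at the exact threshold fails, so the failure points straight at the cause.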
Integration and connectivity are the keywords for an integration test. It comes down to sending and receiving messages, and to the validity of the structure and content of those messages. What if you send an invalid request, or another service is down?
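The questions above translate directly into test cases. The sketch below is an illustration under assumptions: `RequestClient`, `ServiceUnavailable` and the `ship_id` field are hypothetical names, and the other service is replaced by a plain function so the example runs on its own. In a real integration test the transport would wrap the actual connection.

```python
class ServiceUnavailable(Exception):
    """Raised when the other service cannot be reached."""


class RequestClient:
    """Hypothetical client that sends a visit request to another service."""

    def __init__(self, transport):
        # transport is any callable that takes a dict and returns a dict.
        self._transport = transport

    def submit(self, payload: dict) -> str:
        if "ship_id" not in payload:
            raise ValueError("ship_id is mandatory")
        try:
            response = self._transport(payload)
        except ConnectionError as exc:
            raise ServiceUnavailable("request service is down") from exc
        return response["status"]


def accepting_service(payload: dict) -> dict:
    # Stand-in for a healthy service that acknowledges the message.
    return {"status": "RECEIVED", "ship_id": payload["ship_id"]}


def unreachable_service(payload: dict) -> dict:
    # Stand-in for a service that is down.
    raise ConnectionError("connection refused")


def test_valid_request_is_received():
    client = RequestClient(accepting_service)
    assert client.submit({"ship_id": "S-1"}) == "RECEIVED"


def test_invalid_request_is_rejected_before_sending():
    try:
        RequestClient(accepting_service).submit({})
    except ValueError:
        return  # expected: the message never leaves the client
    raise AssertionError("expected ValueError for a missing ship_id")


def test_outage_is_reported_when_service_is_down():
    try:
        RequestClient(unreachable_service).submit({"ship_id": "S-1"})
    except ServiceUnavailable:
        return  # expected: the outage is translated into a clear error
    raise AssertionError("expected ServiceUnavailable")
```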
In reality I still see a lot being tested with GUI tests, but the main subject of a GUI test is and remains the user interface. It’s about user interaction and specific browser behavior. I believe it’s most challenging to get GUI tests atomic, because one goal can be quite a user flow. Let’s zoom in on an example.
This example shows a user flow I wanted to automate in my current context: Port of Rotterdam, where ships sometimes need to request their visits to the harbor. The one goal of this test is to view an approved request (I will spare you the details). In order to execute this test and reach this goal, I need to follow these steps via the GUI:
- User logs in via a 3rd party authorization service
- User fills and sends a new request
- The request is processed by a second external service
- The request is assessed by a third external service
Great, this test has one goal. However, many steps with associated dependencies mean that this test can fail in many places. This will definitely lead to bad bugs in the test.
The outcome of the test shouldn’t be influenced by something outside the context of the test. So, I need to isolate the test as much as possible from other functionalities and services. The goal is to achieve independence, which is a very important characteristic of a healthy test. It is even an absolute must-have when running tests as part of an automated delivery process, or as a nightly build. Looks like I have hedge trimming to do!
Functional & technical hedge trimming
The first cut splits the user flow into multiple functionalities to focus on what is actually important. A rule of thumb is to focus on one webpage or one action/verb. In this case ‘login’, ‘send’ and ‘view’.
👆 Notice that every functionality has its happy and unhappy flows: all good with a successful login, but what if a user logs in with invalid credentials, or tries to send a request with not all mandatory fields filled? Covering those scenarios within the focus of the functionality will keep your tests small.
I only focus on the view of an approved request, nothing more, nothing less. All steps leading up to viewing a request are irrelevant to that goal, meaning I don’t need to run them through the GUI. I set them up programmatically, which brings me to the second cut.
This technical cut minimizes dependencies by simulating all connections with the outside world using mocks and stubs. I set up a logged-in state and a request which is already processed and assessed. In that way I have control over my test data, and I can directly navigate to the page to view the request and focus on the one goal of my test. In practice I see this dramatically improving the speed and stability of a test.
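A minimal sketch of this technical cut, using Python's standard `unittest.mock`. The `RequestPage` class, the session dictionary and the `get_request` call are hypothetical stand-ins for the real application; the point is only the shape of the test. The logged-in state and the approved request are created directly, so the test does not touch the login, send, processing or assessment steps.

```python
from unittest.mock import Mock


class RequestPage:
    """Hypothetical page model: produces what a user sees for one request."""

    def __init__(self, session: dict, backend):
        self._session = session
        self._backend = backend

    def view(self, request_id: str) -> str:
        if not self._session.get("logged_in"):
            return "Please log in"
        request = self._backend.get_request(request_id)
        return f"Request {request['id']}: {request['status']}"


def test_approved_request_is_shown_to_logged_in_user():
    # Logged-in state is set up directly, not via the login screen.
    session = {"logged_in": True, "user": "agent-1"}
    # Stub: the backend answers as if processing and assessment already happened.
    backend = Mock()
    backend.get_request.return_value = {"id": "R-1", "status": "APPROVED"}

    page = RequestPage(session, backend)

    # The one goal of the test: the approved request can be viewed.
    assert page.view("R-1") == "Request R-1: APPROVED"
    backend.get_request.assert_called_once_with("R-1")
```

Because the stub controls the response, the test data is fully under control of the test, and a failure can only come from the view itself.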
Part of setting up a test programmatically is to discover how the application works ‘under the hood’. When I start automating a new web application, I execute the user flows with the network tab of the developer tools open. This gives me a good view of the requests and responses of the backend and when they are triggered. The content of a response is a good starting point for a stub.
A descriptive test describes the expected behavior of the application under specific circumstances. This means that a descriptive test…
- … serves as living documentation
- … provides insight into functional test coverage
- … builds the confidence in quality assurance
But how does this work in different contexts?
The sentence of one sentence…
For a lot of test automation tools the name of a test is just one sentence, so use it wisely! What works for me is to start with the expected behavior, followed by the specific circumstance that determines the outcome. This really forces me to think about the goal of the test from the most suitable perspective.
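As an illustration of this naming pattern, here is a small Python sketch. The function `can_send` and the field names are assumptions made up for this example. Each test name reads as one sentence: first the expected behavior, then the circumstance.

```python
MANDATORY_FIELDS = ("ship_id", "arrival_date")  # hypothetical field names


def can_send(request: dict) -> bool:
    """A request may be sent only when every mandatory field has a value."""
    return all(request.get(field) for field in MANDATORY_FIELDS)


# Name = expected behavior + "when" + the circumstance that determines it.
def test_request_can_be_sent_when_all_mandatory_fields_are_filled():
    assert can_send({"ship_id": "S-1", "arrival_date": "2024-05-01"}) is True


def test_request_cannot_be_sent_when_a_mandatory_field_is_missing():
    assert can_send({"ship_id": "S-1"}) is False
```

Read as a list in a test report, these names already tell which behavior is covered, without opening the test code.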
Gherkin helps to write descriptive tests in the structure Given (context), When (action) & Then (expectation). Another concept that is not new, but definitely an art to apply in practice.
Popular test automation tools translate Gherkin test scenarios to plain code, like SpecFlow for .NET. In practice I see that these tools are often misused.
Think of huge Gherkin scenarios like the one you see on the left, with each sentence representing a click or another GUI action. Because that’s handy to reuse for test automation, right? It may be a short-term solution to quickly automate new scenarios, but the joy won’t last for long. Reusing all steps leads to long user flows with a lot of dependencies, whose purpose is difficult to see, and will ultimately result in bad bugs in the test.
With Gherkin you strive for a compact scenario where the goal can be seen at a glance. In order to make optimal use of Gherkin, it is important to understand its concept and context. Gherkin originated from the behavior-driven development (BDD) methodology.
👆 The book ‘BDD in action’ by John Ferguson Smart really helped me to get the big picture of BDD and understand the meaning of Gherkin
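To show what a compact scenario could look like, here is a sketch in plain Python. The scenario text is in the leading comment, and the three step functions are hypothetical helpers written for this example; they are not the API of SpecFlow or any other BDD tool. Each step speaks at the level of behavior, not clicks.

```python
# Scenario: View an approved request
#   Given an approved request
#   When the user views the request
#   Then the status "APPROVED" is shown


def given_an_approved_request() -> dict:
    # Set up programmatically; no login or send steps in the scenario.
    return {"id": "R-1", "status": "APPROVED"}


def when_the_user_views(request: dict) -> str:
    # Stand-in for opening the view page of the application.
    return f"Request {request['id']}: {request['status']}"


def then_the_status_is_shown(page_text: str, expected_status: str) -> None:
    assert expected_status in page_text


def test_approved_status_is_shown_when_viewing_an_approved_request():
    request = given_an_approved_request()
    page_text = when_the_user_views(request)
    then_the_status_is_shown(page_text, "APPROVED")
```

Three steps, one goal: the purpose of the scenario can be seen at a glance, and the details of how the state is reached stay out of the scenario text.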
Sustainable test automation means to keep adding value with test automation. Catching good bugs confirms the added value of a test set. Bad bugs damage the trust in the test set, and must be avoided to continue catching good bugs. Keeping a test atomic, independent and descriptive helps to avoid bad bugs. No new concepts, however an art to apply in practice. Consider the context of these concepts and your environment to find the most suitable solution, and your test automation garden will continue to flourish!