TDD in Android
Test Driven Development is a controversial topic among software engineers, and it is not rare to find very strong opinions both in favour of and against it. I am in favour of it for the majority of cases. This article is not focused on advocating for TDD, though; it is meant to explain how to properly apply the methodology to build an Android application.
Android, or more specifically the Android SDK, is just an example here: if the reader replaces it with Angular, React, the iOS SDK, or even Spring or Django, the principles and examples shown remain applicable. Architecture concepts that are not specifically related to Android will also be discussed, because they are needed to guarantee that the tests run fast enough for TDD to be feasible.
Example Application
Throughout this article we are going to focus on an example, so it is easy to associate TDD with real life:
In order to organize my daily activities I want to keep a task list
Empty Task List
Given I have no tasks yet
When I open task application
Then I see Task List screen
And I see an empty task list
Add Task Action
Given I see the Task List screen
When I click Add Task button
Then I see Save Task screen
Save Task
Given I see Save Task screen
And I write call mum in the description
When I click Save button
Then I see Task List screen
And I see call mum in the task list
This example uses BDD to describe an application. As we can see, it doesn't describe an Android, iOS or Web application; it focuses on the behaviour that should be implemented. Many times we fail to use TDD to solve a problem because we don't first describe the behaviour clearly, in a way that is agnostic to technology.
TDD works best when the architecture helps to isolate technology details (like the GUI, database, HTTP, Bluetooth etc.), because those details are slow and flaky to test automatically.
Also, as developers we can easily get distracted by technology and lose focus on the business value being added to the application. The scenarios above will help guide the creation of the tests, as we will see next.
Test Driven Development
Create a failing test. Make it pass. Refactor.
These are the fundamentals of TDD. Although there are plenty of resources to learn the basics, there is still misunderstanding around how to properly implement an application using TDD. Let's consider the test pyramid, which from top to bottom consists of Exploratory, End-to-end, Component, Integration and Unit tests:
The pyramid is about how many automated tests we write for each category (except for Exploratory tests, which are usually performed manually). Component and Integration tests are the most misunderstood ones, which is why applications tend to have only Unit and/or End-to-end tests. By Martin Fowler's definition in Testing Strategies in a Microservices Architecture, a Component test integrates all the parts of an application that are not slow. Slow parts are external services accessed via I/O operations, like a database or an HTTP server.
Integration tests check whether our application works with external services by actually calling them, which can be slow and flaky. We can design our code so that Integration tests focus on verifying that our application respects the contract defined by these services. There are various ways to specify such a contract, but commonly it is specified via SQL in the case of a database, or via a URL and JSON in the case of an HTTP server.
Top-Down vs. Bottom-Up
When approaching a new feature, it is a common mistake to apply TDD starting at the bottom of the test pyramid. This is not a good idea, because TDD is essentially a design methodology: it helps us discover how to solve a problem as we add new tests. To use a bottom-up approach we need to make a design decision first and then start building on top of that decision.
For example, in the Empty Task List scenario a bottom-up approach could start by creating a getTasks method in a TaskPersistence interface and then creating a TaskSQLiteDatabase implementation. Doing that by writing the tests first is possible, but it wouldn't really be TDD, because we started by making design decisions: we decided to create a separate class to handle the persistence of the tasks, to create an abstraction with the TaskPersistence interface, and to use a SQLite database to persist the tasks.
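To make this concrete, such a bottom-up starting point might look like the sketch below; the names come from the example above, but the bodies are hypothetical:

```kotlin
// Bottom-up: the abstraction and its implementation are chosen first,
// before any behaviour actually demands them.
data class Task(val description: String)

interface TaskPersistence {
    fun getTasks(): List<Task>
}

class TaskSQLiteDatabase : TaskPersistence {
    override fun getTasks(): List<Task> {
        TODO("query the tasks table in SQLite")
    }
}
```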
Although some of these decisions will be made at some point, with TDD we should make them motivated by the behaviour we want to implement. This approach avoids accidental complexity, because it is very easy to make mistakes when we create abstractions up-front. TDD is about "forgetting" our past experiences (architecture approaches, past projects and past mistakes) and letting the tests guide us. When we reach the Refactor phase we can apply our experience, aiming to be as lean as possible.
In a top-down approach we start at the top of the pyramid (or as close to the user as we can get) and derive the rest of the application from there. Since we are going to use TDD, we shouldn't start with Exploratory tests, because they are manual. End-to-end tests are not the best option either, because they are flaky and take too much time to run; starting with them would make our feedback loop too long. Component tests are the best starting point, because we start from the perspective of the user and still have a short feedback loop.
Graphical User Interface
In Android the GUI can be tested with instrumentation tests using Espresso, which require an emulator and are therefore slow and flaky. If we use Robolectric the tests run much faster, though not as fast as if they were written in pure Kotlin: Robolectric sets up a fake Android SDK in the local JVM when the tests run, which increases their run time. In a Web application using React this wouldn't be an issue, because React tests run fast enough that we could easily write them at the GUI level.
Since only the happy path will be tested through the GUI, Robolectric is fast enough to let us test-drive an Android application. Most of the tests will be implemented in lower layers of the test pyramid anyway, leaving the GUI out.
Let's code!
So let's start the development of the story above. The Save Task scenario is the one that brings value to the user; once it is completed, the story will be done. The problem is that we have a greenfield application, and starting with Save Task would make our feedback loop too long.
The Add Task Action and Empty Task List scenarios are much simpler. It is more natural to implement the Empty Task List scenario first, since it is the starting state of the application, and then the Add Task Action scenario. Empty Task List is an edge case and doesn't bring that much value to the user, so I will test-drive its implementation without the GUI. We should keep edge-case scenarios in the lower levels of the test pyramid, leaving the happy-path scenarios to the higher levels.
Since this test keeps the UI out, some design decisions have to be made while writing it: a callback is registered in TaskApplication to receive a TaskListScreen with the list of tasks.
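A minimal sketch of that component test could look like this (TaskApplication, TaskListScreen and withScreenCallback are the names used in this article; the start method and tasks property are assumptions):

```kotlin
import org.junit.Assert.assertEquals
import org.junit.Test

class EmptyTaskListTest {

    @Test
    fun `shows an empty task list when the application starts`() {
        val application = TaskApplication()
        var screen: TaskListScreen? = null
        // The callback receives the screen the application wants to show.
        application.withScreenCallback { screen = it }

        application.start()

        assertEquals(emptyList<String>(), screen?.tasks)
    }
}
```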
As we can see, the minimal possible changes were made to make the test pass. Let's tackle the Add Task Action scenario next:
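A hedged sketch of that test (the addTask method, representing the user clicking the Add Task button, is an assumption):

```kotlin
import org.junit.Assert.assertTrue
import org.junit.Test

class AddTaskActionTest {

    @Test
    fun `shows the save task screen when add task is clicked`() {
        val application = TaskApplication()
        var screen: Screen? = null
        application.withScreenCallback { screen = it }
        application.start()

        // Corresponds to the user clicking the Add Task button.
        application.addTask()

        assertTrue(screen is SaveTaskScreen)
    }
}
```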
We had to add a Screen interface to create this test, so that the withScreenCallback method can be reused for SaveTaskScreen.
Now we can also apply some refactors. The screenCallback variable can have a default no-op value, so we can avoid the ? syntax and call the function directly. We can also move Screen, TaskListScreen and SaveTaskScreen to their own files.
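After these refactors, TaskApplication and the screen types could look roughly like this (a sketch under the assumptions above):

```kotlin
interface Screen
data class TaskListScreen(val tasks: List<String>) : Screen
class SaveTaskScreen : Screen

class TaskApplication {

    // A default no-op callback removes the nullable type and the
    // screenCallback?.invoke(...) call syntax.
    private var screenCallback: (Screen) -> Unit = {}

    fun withScreenCallback(callback: (Screen) -> Unit) {
        screenCallback = callback
    }

    fun start() {
        screenCallback(TaskListScreen(tasks = emptyList()))
    }

    fun addTask() {
        screenCallback(SaveTaskScreen())
    }
}
```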
Finally we can attack the Save Task scenario. This time we are going to include the UI in the test, making it more complete.
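Run with Robolectric, such a test might look like the sketch below; the view ids and MainActivity are assumptions and don't exist yet, which is exactly the point:

```kotlin
import androidx.test.core.app.ActivityScenario
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.action.ViewActions.typeText
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.espresso.matcher.ViewMatchers.withText
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class SaveTaskTest {

    @Test
    fun `saves a task and shows it in the task list`() {
        ActivityScenario.launch(MainActivity::class.java)

        onView(withId(R.id.add_task_button)).perform(click())
        onView(withId(R.id.task_description)).perform(typeText("call mum"))
        onView(withId(R.id.save_task_button)).perform(click())

        onView(withText("call mum")).check(matches(isDisplayed()))
    }
}
```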
This test fails because we don't have a MainActivity yet, it is not declared in the AndroidManifest, and the ids we are referencing don't exist. To make it pass, some design decisions have to be made. Since there will be two screens in the application, we could use Fragments, Activities or Views to represent them. I will use Views and a single Activity, out of personal preference.
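The heart of that implementation could be an Activity that swaps root Views as the screen callback fires; TaskListView and SaveTaskView are assumed custom View classes, and creating TaskApplication inside the Activity is a simplification for the sketch:

```kotlin
import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {

    // Created here to keep the sketch short; in the real code it would be
    // shared, for example via the Application class.
    private val taskApplication = TaskApplication()

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Swap the root View whenever the application emits a new screen.
        taskApplication.withScreenCallback { screen ->
            when (screen) {
                is TaskListScreen -> setContentView(TaskListView(this, screen.tasks))
                is SaveTaskScreen -> setContentView(SaveTaskView(this))
            }
        }
        taskApplication.start()
    }
}
```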
In order to make the test pass I had to create many classes (I have omitted the xml files and the list adapter implementation). The feedback loop time was not ideal, but the next stories will require less code, because the foundations will already be in place. To mitigate a long feedback loop in greenfield applications, the first story has to be made as small as possible while still providing value to the user.
A tasks variable was created in TaskApplication, which means the tasks are kept in memory. If the user kills the application and reopens it, the tasks will be gone. We need to make sure the tasks are persisted, so we need a new scenario:
Persist Saved Tasks
Given I save call mum task
And I close task application
When I open task application
Then I see call mum task in the task list
In an agile team it is very normal for scenarios to be added (and removed) while the story is in development. It is impossible, and counter-productive, to try to think of all scenarios at the beginning. As the story evolves, designers, QA, stakeholders and engineers will come up with new scenarios, and as long as we stay focused on the minimum viable product this is fine. In our case the application doesn't make sense for the user if it doesn't persist the tasks.
The test is implemented without the UI, because this is an edge case and also because it is much easier to simulate the scenario this way.
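Recreating TaskApplication is a cheap way to simulate closing and reopening the application. In this sketch the Context constructor parameter and the saveTask method are assumptions, and the details of waiting for the asynchronous work are elided:

```kotlin
import android.content.Context
import androidx.test.core.app.ApplicationProvider
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Assert.assertEquals
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class PersistSavedTasksTest {

    @Test
    fun `shows previously saved tasks after the application is reopened`() {
        val context = ApplicationProvider.getApplicationContext<Context>()
        val firstRun = TaskApplication(context)
        firstRun.start()
        firstRun.addTask()
        firstRun.saveTask("call mum")

        // A fresh instance simulates closing and reopening the application.
        val secondRun = TaskApplication(context)
        var screen: Screen? = null
        secondRun.withScreenCallback { screen = it }
        secondRun.start()

        assertEquals(listOf("call mum"), (screen as TaskListScreen).tasks)
    }
}
```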
I have chosen a Handler to implement the asynchronous behaviour and to post back to the main thread. Posting to taskHandler in every public method ensures the asynchronous actions happen sequentially (avoiding inconsistent states). SQLite was chosen to persist the tasks. Although this implementation makes the test pass, it is not very clean, so let's refactor it to move the Handler and SQLite dependencies away from our application logic.
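After the refactor, the shape could be something like the sketch below; TaskRepository and TaskExecutor are assumed names for the extracted boundaries, with one implementation wrapping SQLite and the other wrapping the Handler:

```kotlin
interface TaskRepository {
    fun getTasks(): List<String>
    fun save(task: String)
}

interface TaskExecutor {
    fun execute(action: () -> Unit)
}

class TaskApplication(
    private val repository: TaskRepository,
    private val executor: TaskExecutor
) {
    private var screenCallback: (Screen) -> Unit = {}

    fun withScreenCallback(callback: (Screen) -> Unit) {
        screenCallback = callback
    }

    // Every public method goes through the executor, so actions stay sequential.
    fun start() = executor.execute {
        screenCallback(TaskListScreen(repository.getTasks()))
    }

    fun addTask() = executor.execute {
        screenCallback(SaveTaskScreen())
    }

    fun saveTask(description: String) = executor.execute {
        repository.save(description)
        screenCallback(TaskListScreen(repository.getTasks()))
    }
}
```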
Now when we read the TaskApplication code we can focus on the business logic. There are some other scenarios, related to back navigation and possible memory leaks in MainActivity (we currently don't remove the callback in onDestroy), that should be implemented using Unit tests.
Edge cases related to errors are also good candidates for Unit tests, because setting up a Component test for them would be cumbersome: simulating non-standard behaviour in a Component test requires injecting mocks or fakes, which makes the test white box. Component tests have to be black box, so that after a big refactor of the application we can run them and feel confident the system still works as expected.
Since Robolectric has an option to run the SQLite database in-memory, our tests run against the real database implementation, so we don't need to write Integration tests. If we were querying the tasks via HTTP, we would need to fake the server in our Component tests, and an Integration test would have to be written to validate that the real server behaves as expected. In that Integration test we would instantiate TaskRepository (now talking to a server via HTTP instead of to a database via SQLite) and call its methods; if TaskRepository is able to save and retrieve tasks from the server under test, the test passes.
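Such an Integration test could look roughly like this; HttpTaskRepository and the server URL are hypothetical:

```kotlin
import org.junit.Assert.assertEquals
import org.junit.Test

class TaskRepositoryIntegrationTest {

    @Test
    fun `saves and reads back a task through the real server`() {
        // Points at a running instance of the service under test.
        val repository = HttpTaskRepository(baseUrl = "http://localhost:8080")

        repository.save("call mum")

        assertEquals(listOf("call mum"), repository.getTasks())
    }
}
```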
Conclusion
TDD is about navigating the test pyramid and deciding in which layer a behaviour should be implemented. To do that, the behaviour first has to be very clear. We should favour implementing new behaviour at the top of the test pyramid, so the happy paths are covered by tests that integrate everything in our application that is not slow.
By following this approach we avoid accidental complexity and keep ourselves focused on adding value for the user. Development speed also remains stable over time as the source code scales, because the test coverage gives developers the confidence to change the application.