The iOS Testing Manifesto
Table of Contents
- Rules For Writing A Good Unit Test
- Test Driven Development
- Dependency Injection
- Mocks, Stubs, Fakes, Spies, Dummies
- Model Testing
- View Testing
- Controller Testing
- Testing Other Objects
- Other Frameworks
Most engineers agree that testing is good. There have been so many articles and blogs about the benefits of testing that they’re too numerous to list. But what does it mean to write a unit test in the iOS environment? What makes a good unit test? What’s worth testing?
iOS developers are notoriously bad at writing unit tests — in the sense that as a community, we don’t seem to write too many of them. In 2013, I was vaguely aware of SenTestingKit — and like most other iOS developers, didn’t use it. I had a number of rationalizations about why I didn’t write tests — from things like “I have types, and so I’m not as concerned about things going wrong like in Python or Ruby” to “Why should I waste time verifying my code works when I already know that my code works?”
A few years later, I realized that there's no good replacement for unit tests. Testing isn't a nice-to-have, it's an absolute game-changer. Working in a well-tested codebase gives me the confidence to make changes and experiment without fearing that I'll break something somewhere in the application. I know that if something goes wrong with my refactor, or with the integration of my new feature, I'll likely have a wall of test failures notifying me of my mistake. Most importantly, tests help newcomers to a codebase understand the intent behind each function.
Rules For Writing A Good Unit Test
- Each individual test should be fast.
- No networking in unit tests. If a unit test performs a networking event, it’s not a unit test!
- Tests should be independent and isolated. Tests should not mutate state for other tests, and they shouldn’t perform tear-down or set-up for one another.
- Tests should be repeatable. A flaky test causes more harm than it's worth. Either fix it or delete it.
- No dependencies out of your control. Don’t have tests rely on a fixed system date, timezone, or otherwise — these things should either be in the testing suite or the testing environment. Stubbed data shouldn’t be reliant on near-future dates.
In addition, a good test has the following shape:
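For example (the model and view model here are hypothetical stand-ins invented for illustration):

```swift
import XCTest

// Hypothetical model and view model under test.
enum TrafficLight { case red, yellow, green }

struct TrafficLightViewModel {
    let light: TrafficLight
    var labelText: String {
        switch light {
        case .red: return "Stop"
        case .yellow: return "Slow"
        case .green: return "Go"
        }
    }
}

final class TrafficLightViewModelTests: XCTestCase {
    // Naming: test_<behavior>_<context>.
    func test_labelText_whenLightIsRed() {
        // Setup: put the object under test in the desired state.
        let viewModel = TrafficLightViewModel(light: .red)

        // Behavior modification and capturing changes:
        // explicitly capture both expected and actual values, using fixed data.
        let expectedText = "Stop"
        let actualText = viewModel.labelText

        // Assertion: our assumptions vs. the actual behavior.
        XCTAssertEqual(actualText, expectedText)
    }
}
```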
Let’s take a look at what’s going on up there:
- Naming. Every XCTest function starts with `test`. The underscores delimit logical groupings: the second logical group is the behavior being tested, and the third is the context under which it's being tested. A fourth logical grouping can be added if the context merits it.
- Setup. Explicitly listing the dependencies needed to put an object under test in the desired state.
- Behavior Modification. Calling the method, or setting the value that causes the change in the object under test.
- Capturing Changes. When comparing two values, explicitly capture both the expected and actual values. When setting the expected value, use fixed data if possible.
- Assertion. This is where our assumptions about the behavior of the class are tested against the actual behavior of the class.
When tests have a fixed format and shape, it makes it easy for future developers to assess the goal of the test and usage of the class/function. Tests will often serve as a form of documentation for functions and objects to future developers.
Test Driven Development
When writing unit tests for our functions and classes, it helps to borrow a concept from our functional-programming friends and think of functions in terms of their signatures. For example, if we need a function to generate the text for a label from some TrafficLight model, our signature is (TrafficLight) -> String. We always begin by writing the simplest, empty form of our function, writing failing tests, and then iterating until the tests pass.
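A sketch of this flow (the TrafficLight cases and label strings are assumptions):

```swift
import XCTest

enum TrafficLight { case red, yellow, green }

// Step 1: the simplest, empty form of our (TrafficLight) -> String function.
// We fully expect the tests below to fail against this base case:
//
//     func labelText(for light: TrafficLight) -> String { return "" }
//
// Step 2: iterate on the implementation until every test passes:
func labelText(for light: TrafficLight) -> String {
    switch light {
    case .red: return "Stop"
    case .yellow: return "Slow"
    case .green: return "Go"
    }
}

final class TrafficLightLabelTests: XCTestCase {
    func test_labelText_whenRed() {
        XCTAssertEqual(labelText(for: .red), "Stop")
    }

    func test_labelText_whenYellow() {
        XCTAssertEqual(labelText(for: .yellow), "Slow")
    }

    func test_labelText_whenGreen() {
        XCTAssertEqual(labelText(for: .green), "Go")
    }
}
```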
Let’s observe what’s happening above:
- We wrote our function in terms of its signature, returning a base case that we fully expect to fail.
- We wrote all of our expected test cases for this function.
Once we have our failing tests, it’s important to actually run them and watch them fail; it’s impossible to make tests go from failing to passing if they already pass. Once they fail, begin iterating in the production class until all of our test cases are successful.
Writing Tests for Existing Classes With No Tests
As engineers, the majority of our jobs isn’t to write new code — it’s to maintain and modify existing code. Quite often, legacy code doesn’t have unit tests which can make working in it a dangerous task that’s prone to regression. The first step to working in a legacy file, or a file with no unit tests, is to write tests targeting the existing functionality that will be changed. This means understanding the subset of functionality of that file that will be affected, and writing tests that document its current behavior.
When writing tests for existing classes with no coverage, it’s important to still double-check that tests are valid by modifying them so that they fail when we expect them to. If we only write green tests, we are unlikely to have a good understanding of the original design and intent of the class. We may have to modify some of the implementation of the original class in order to make it easily testable — it may not have been written with dependency injection in mind, or may have a large list of implicit dependencies. When modifying the original class for testability, we should try to minimize the number of changes so that we disturb the original code as little as possible.
Once existing behaviors are documented, we can begin the standard TDD flow.
If an application is entirely — or mostly — untested, a great approach is to gradually add tests to the classes that are being modified, covering the parts that change.
Dependency Injection
Martin Fowler has written extensively about dependency injection.
In general, when designing our classes, try to arrange things so that the properties held within a class are passed in through the constructor. If the constructor grows too large, allow setters on the object.
As a rule of thumb, have constructor-injection for private variables (preferably with defaults in the constructor), and setter-injection for public variables.
Do not sacrifice access control for testability. Rather than exposing private members to tests, inject them, hold a reference to them in the test, and assert on their internal state accordingly.
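A sketch of this rule of thumb (the service protocol and all names here are hypothetical):

```swift
// Hypothetical service protocol and live implementation.
protocol TrafficLightService {
    func currentLight() -> String
}

final class LiveTrafficLightService: TrafficLightService {
    func currentLight() -> String { return "red" }
}

final class TrafficLightViewModel {
    // Constructor-injection for private dependencies, with a production default.
    private let service: TrafficLightService

    // Setter-injection for public, mutable collaborators.
    var onUpdate: (() -> Void)?

    init(service: TrafficLightService = LiveTrafficLightService()) {
        self.service = service
    }

    func refresh() -> String {
        defer { onUpdate?() }
        return service.currentLight()
    }
}
```

A test can pass a stubbed TrafficLightService through the initializer, while production call sites use the default and never know the difference.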
Mocks, Stubs, Fakes, Spies, Dummies
It’s important to isolate our tests to our local environment. If objects have references to other things — network services, delegates, etc… — it’s important to inject a simplified or controllable version of that object. Test objects that have different purposes have unique names. The terminology itself isn’t as important as the idea, but a quick set of rules for identifying these objects are:
- Mocks are objects that register the calls they receive.
- Stubs are objects that hold pre-defined data and use it to answer calls during tests. They’re often used to test networking.
- Fakes are objects that have working implementations, but not the same as production.
- Spies are stubs that record some information based on how they were called. This can mean counting the number of times a function was called, holding references to the parameters passed in, or something else.
- Dummies are objects passed around but never actually used — often just for filling parameter lists.
Personally, I usually avoid fakes because it can easily lead down a path of testing test-code instead of testing production code.
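For instance, a minimal spy (the AnalyticsTracker protocol is hypothetical):

```swift
// Hypothetical protocol that a production object depends on.
protocol AnalyticsTracker {
    func track(event: String)
}

// A spy: a test double that records how it was called,
// so tests can assert on call counts and parameters.
final class SpyAnalyticsTracker: AnalyticsTracker {
    private(set) var trackedEvents: [String] = []

    func track(event: String) {
        trackedEvents.append(event)
    }
}
```

A test injects SpyAnalyticsTracker where production code would receive the real tracker, triggers some behavior, and then asserts on trackedEvents.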
Model Testing
Models are usually the first place I start when writing code. They're easier to test than most objects — they're relatively isolated and self-contained. It's easy to get away without writing tests for a model like this:
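A hypothetical stand-in, made only of stored properties:

```swift
// A plain value type with no behavior of its own — nothing here needs a test.
struct User {
    let id: Int
    let name: String
    let email: String
}
```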
But once a mutating function, public function, didSet, or computed property is written on a model, we need to add tests for those functions. In addition, we should always create a testing model factory for each of our new models (enums being the exception), even if they don't have tests associated with them.
A model factory is exactly what it sounds like — a convenience constructor that gives the ability to create a new instance of a model with zero arguments. These are objects that exist only in the test suite.
- Factories should be an enum so that nobody can accidentally instantiate them.
- The create function should take all of the arguments of the model's init and supply a default value for each (where possible). This has the dual benefit of giving us a zero-argument init as well as a constructor that can conceivably accept any injectable or observable property on the model.
Model factories form the basis of dependency injection.
Enums do not require a factory. Otherwise all other rules apply.
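Following those rules, a factory might look like this (the User model is a hypothetical stand-in):

```swift
struct User {
    let id: Int
    let name: String
    let email: String
}

// Lives only in the test target. An enum, so it can't be instantiated.
enum UserFactory {
    // Mirrors User.init, supplying a default value for every argument.
    static func create(
        id: Int = 1,
        name: String = "Test User",
        email: String = "test@example.com"
    ) -> User {
        return User(id: id, name: name, email: email)
    }
}
```

Tests can call UserFactory.create() with zero arguments, or override only the property under test, e.g. UserFactory.create(name: "Ada").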
View Testing
There are a number of different philosophies on testing views. Most people fall into three camps:
- Use XCUITest to test views, automating a portion of a smoke-test.
- Use iOS-snapshot-test-case to test views.
- Write unit tests and forget about UI tests.
XCUITest is built into Xcode and is Apple’s UI testing framework. It comes with a few handy features, like a UI Test Recorder to generate tests.
Unfortunately, XCUITest fails two of the basic rules of testing:
- UI tests written with XCUITest are slow. Testing a log-in flow can take up to ten minutes.
- UI tests written with XCUITest are flaky, with a poor signal-to-noise ratio.
For those reasons, XCUITest is unsuitable for production use with automated integration environments.
Snapshot testing uses stubbed data to assert the correctness of some code. In this instance, snapshot testing records screenshots of iOS views and makes sure they don’t unexpectedly change. Snapshot test failures produce diffs for inspection.
iOS-Snapshot-Test-Case, formerly FBSnapshotTestCase, is maintained by Uber and used in a large number of production applications. Unlike XCUITest, it’s fast and far more reliable.
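A minimal snapshot test might look like this (the view under test and its configure API are hypothetical; the module name can vary by installation):

```swift
import FBSnapshotTestCase  // module name may be iOSSnapshotTestCase depending on how it's installed
import UIKit

final class TrafficLightViewSnapshotTests: FBSnapshotTestCase {
    override func setUp() {
        super.setUp()
        // On the first run, set this to true to record reference images on disk,
        // then flip it back to false so subsequent runs verify against them.
        recordMode = false
    }

    func test_trafficLightView_redState() {
        // Hypothetical view and configuration call.
        let view = TrafficLightView(frame: CGRect(x: 0, y: 0, width: 100, height: 300))
        view.configure(with: .red)

        // Compares the rendered view against the stored reference image.
        FBSnapshotVerifyView(view)
    }
}
```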
- We inherit from FBSnapshotTestCase.
- We set the recording mode. On first run, set this to true, which creates a reference image on disk to test against.
- We test our current view against our reference image.
For a more detailed jump into snapshot testing, check out this article by Stephen Celis.
No UI Tests
There’s an argument to be made about the cost-value ratio of this type of testing. This type of testing is most easily caught during a QA-run or smoke-test. Assuming high unit test coverage, this type of testing may not provide much value — although I’m personally of the belief that it can prevent unexpected UI regressions, and is especially useful when upgrading SDK versions.
It’s important to test all of the responsibility of a controller. For the purposes of this article, I’m assuming that controllers have the following responsibilities:
- View presentation
- Navigation
- Delegate conformance
- Networking
If some of these responsibilities, such as networking or navigation, have been delegated to another object (such as an Entity), these guidelines still apply to those objects.
Testing Navigation and View Presentation
When testing navigation, a UIWindow is required in the testing class.
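A sketch of that set-up (the controller names are hypothetical):

```swift
import UIKit
import XCTest

final class TrafficLightListNavigationTests: XCTestCase {
    // Declare dependencies and keep references to them.
    var controller: TrafficLightListViewController!  // hypothetical controller under test
    var navigationController: UINavigationController!
    var window: UIWindow!

    override func setUp() {
        super.setUp()
        controller = TrafficLightListViewController()
        // The navigation controller lets us test pushes and pops.
        navigationController = UINavigationController(rootViewController: controller)
        // The window puts views into a real view hierarchy so presentation works.
        window = UIWindow(frame: UIScreen.main.bounds)
        window.rootViewController = navigationController
        window.makeKeyAndVisible()
    }
}
```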
- We are declaring our variables and keeping a reference to them.
- We are instantiating a UINavigationController in order to test navigation.
- We are instantiating a UIWindow so that we can properly present controllers whose views are in the view hierarchy.
At this point, we can safely test navigation:
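For example, inside the test class described above (the controllers and the showDetail method are hypothetical):

```swift
func test_selectingLight_pushesDetailController() {
    // Trigger the same code path a user interaction would.
    controller.showDetail(for: .red)  // hypothetical method that pushes a detail controller

    // Give the run loop a beat to process the push before asserting.
    RunLoop.current.run(until: Date())
    XCTAssertTrue(navigationController.topViewController is TrafficLightDetailViewController)
}
```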
When testing navigation as a result of interaction with a UICollectionViewController, it's generally a good idea to inject the data into the controller in your tests, or to use an enumeration that dictates which rows/sections everything belongs to. Typing out a raw index path can lead to brittle tests.
In the case of delegate conformance, we only want to test that calling the delegate methods performs the action that we expect.
Testing this should be no different than testing any other function. We do not have to test that these methods get called when the object that holds a reference to the delegate does something.
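For example, if a hypothetical controller conforms to a TrafficLightPickerDelegate, we call the delegate method directly and assert on the outcome (all names assumed):

```swift
func test_pickerDidSelect_updatesStatusLabel() {
    // Hypothetical controller conforming to TrafficLightPickerDelegate.
    let controller = TrafficLightViewController()
    controller.loadViewIfNeeded()

    // Call the delegate method directly; we don't need the real picker object.
    controller.picker(didSelect: .red)

    XCTAssertEqual(controller.statusLabel.text, "Stop")
}
```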
- We are declaring our dependencies.
- We are calling the function or setting the values that we believe will cause our controller to behave a certain way.
- We are retrieving a value that we can test against.
- We are testing our assumptions with a type-check.
Notice that we are not writing redundant tests, such as:
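For instance (hypothetical names):

```swift
func test_controller_isPickerDelegate() {
    // Redundant: the compiler already guarantees this conformance.
    XCTAssertTrue(controller is TrafficLightPickerDelegate)
}
```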
These types of tests don't provide much value because they don't test the behaviors that we expect — they only test type/protocol conformance.
When testing networking, it's important to remember that we are not going to actually make a network call; instead, it is standard to work against a stubbed response.
Let’s take a look at testing the successful case of an asynchronous network call:
- We are declaring our dependencies, mock, and stubs.
- We are declaring further dependencies and composing with dependency injection.
- We are setting an expectation for testing asynchronous calls.
- We enter the context of our asynchronous network call.
- We bail early if something was wrong with our set-up.
- We begin testing our assumptions.
- We fulfill our expectation.
- We set our time-out for our asynchronous test.
It’s important to test the successful and unsuccessful variants of a network call. If we are displaying an alert or toast on networking failure, make sure to test for that.
Testing Other Objects
Test delegates in the classes that hold onto them. For example, if we have a class like this:
In our test class, we should make sure that the delegate method is being called when we expect.
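A self-contained sketch of both the class and its test (the TrafficLightManager, its delegate protocol, and the TrafficLight enum are all hypothetical):

```swift
import XCTest

enum TrafficLight { case red, yellow, green }

protocol TrafficLightManagerDelegate: AnyObject {
    func manager(_ manager: TrafficLightManager, didChange light: TrafficLight)
}

// The class under test: it holds onto a delegate and notifies it on change.
final class TrafficLightManager {
    weak var delegate: TrafficLightManagerDelegate?
    private(set) var light: TrafficLight = .red

    func change(to light: TrafficLight) {
        self.light = light
        delegate?.manager(self, didChange: light)
    }
}

// Declared as a top-level class in the test target, so other tests can reuse it.
final class MockTrafficLightManagerDelegate: TrafficLightManagerDelegate {
    private(set) var didChangeCallCount = 0
    private(set) var lastLight: TrafficLight?

    // The mocked method just records state for later introspection.
    func manager(_ manager: TrafficLightManager, didChange light: TrafficLight) {
        didChangeCallCount += 1
        lastLight = light
    }
}

final class TrafficLightManagerTests: XCTestCase {
    func test_change_notifiesDelegate() {
        let manager = TrafficLightManager()
        let mockDelegate = MockTrafficLightManagerDelegate()
        manager.delegate = mockDelegate

        // Trigger the event we believe calls the delegate.
        manager.change(to: .green)

        // Test our assumptions.
        XCTAssertEqual(mockDelegate.didChangeCallCount, 1)
        XCTAssertEqual(mockDelegate.lastLight, .green)
    }
}
```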
- A mock delegate is declared as a top level class in the test target (it could be useful in other classes).
- The mocked method just updates the mock object’s state for introspection.
- We trigger an event that we believe will give us something to test our assumptions against.
- We test our assumptions.
Testing view models / view configurations is very similar to testing other models. Let’s take a look at a simple view model:
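A hypothetical sketch:

```swift
import UIKit

enum TrafficLight { case red, yellow, green }

// Maps a model to the user-facing values a view needs.
struct TrafficLightViewModel {
    let light: TrafficLight

    var labelText: String {
        switch light {
        case .red: return "Stop"
        case .yellow: return "Slow"
        case .green: return "Go"
        }
    }

    var labelColor: UIColor {
        switch light {
        case .red: return .systemRed
        case .yellow: return .systemYellow
        case .green: return .systemGreen
        }
    }
}
```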
Testing this involves checking that the text, images, colors, and any other user-facing outputs of the view model are as we expect them. Of course, these tests should be written before the view model, because otherwise we are just copy-pasting production code into our tests. Green-only tests don't tell us anything.
Other Frameworks
Quick and Nimble
Quick and Nimble are BDD frameworks, deeply inspired by Ruby's RSpec. Personally, I believe that Quick provides a far superior testing experience to XCTest. Ruby's RSpec is to many people — myself included — the gold standard against which all other testing frameworks are held. Quick and Nimble replicate the behavior and usage of RSpec while keeping the syntax Swift-y.
For a more detailed look at BDD best practices, check out the BetterSpecs website.
SwiftCheck
SwiftCheck is a generative testing framework, intended to be a port of Haskell's QuickCheck. Haskell's QuickCheck (and thus SwiftCheck) relies on the notion that pseudo-random chaos is the best test against a function. Instead of writing out specific test-cases, we write a function that details the valid outputs of a given function. SwiftCheck generates an arbitrary set of data to test each function, and runs that arbitrary set through — testing that each output is valid.
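A property-based test in SwiftCheck looks roughly like this (a sketch using the classic reverse-twice-is-identity property):

```swift
import SwiftCheck
import XCTest

final class ReverseProperties: XCTestCase {
    func test_reverseTwice_isIdentity() {
        // SwiftCheck generates arbitrary [Int] inputs and checks that the
        // property holds for every generated case.
        property("reversing an array twice returns the original") <- forAll { (xs: [Int]) in
            return Array(xs.reversed().reversed()) == xs
        }
    }
}
```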
SwiftMonkey
SwiftMonkey is similar to SwiftCheck, but is intended to generate chaotic, random UI tests. SwiftMonkey suffers from the same problems as XCUITest, but is useful for discovering rare crashes in an application.
Further Reading
- 99 Bottles of OOP by Sandi Metz. This is a book written for Ruby developers, but every idea shared there applies generally to programming. Most notably, red-green refactoring, shameless green commits, and flocking rules.
- Working Effectively with Legacy Code by Michael Feathers. A great book about how to deal with legacy projects that don't have any tests.
- Pragmatic Testing, by Orta Therox (and others). Open Source eBook on iOS testing.
- An Artsy Testing Tour, a try!Swift talk by Ash Furrow.
- The Two Sides of Writing Testable Code, a talk by Brandon Williams.
- Testing an Untested App, a talk by Michael May.