E2E testing with Angular, Protractor, and Rails.

End-to-end testing is hard. Intermittent failures, recreating apps in a particular state, and quickly generating seed data are all problems you will encounter as you embark on the adventure of E2E testing. Through trial and error, we’ve come a long way in building a test environment for Fedora that doesn’t suck. I wish we had been able to find more advice and inspiration on setting up our tests when we first started — there is a surprising lack of good literature, code samples, and best practices for writing end-to-end tests. The examples that I’m about to give use Angular, Rails, and Protractor since that’s what we use at Fedora, but the patterns are universal and I think you’ll find them valuable regardless of your stack.

Mocking Data vs Using A Real Server

The first decision I was confronted with in building our E2E setup was whether I should run our tests against an instance of our server or intercept our API calls and reply with mocked data directly. This is a big decision, but the tradeoff is simple to understand: mocked data is easier to set up than running a server in the background and allows tests to run faster, but it is less realistic and thus more likely to miss problems; running an instance of the actual server, on the other hand, is more complicated to set up and will slow down the tests, but it more closely mirrors an actual production environment and is therefore more likely to catch actual integration problems.

I decided to run our tests against an actual server because I felt that a more realistic environment was important for catching integration problems. That said, you’ll find plenty of engineers who disagree and feel that client-side end-to-end tests should be agnostic of your server — just as the Angular E2E tutorial describes. This is simply an engineering decision that your team has to make, and what’s right for one team might not be right for another. I’m happy with our decision — our front end E2E tests have actually caught quite a few back end bugs where a subtle change to our API broke our interface without actually causing our back end tests to fail. In this regard, our E2E tests are truly end to end and have the benefit of testing our entire stack.

Setup

The implementation details of our setup are specific to Rails and Protractor. Skip this section if you don’t care about Rails or Protractor.

We use Gulp to set up our test environment as it’s perfect for executing a series of arbitrary tasks in the right order. When we run gulp test:e2e --specs=students/index, it automatically builds our Angular app, spins up an instance of our Rails server running in a test environment, starts an instance of Selenium in the background, and then feeds the students/index spec to Protractor.

The test environment uses its own database, separate from our development database — pretty standard for Rails. We’ve ended up adding a few things to Rails that only happen in our test environment to facilitate our E2E tests, but I’ll get to those later. Rails actually comes with a test environment out of the box, so we share the same environment for both Rails E2E tests (used by Capybara) and our Angular E2E tests (Protractor). This means we can share the same set of seed data for both, which saves time because we don’t have to generate two sets of seed data.

State

While there is a fair amount of documentation on writing E2E tests with Angular and Protractor, I found very little advice on how to set up your stack for testing and move between states. Recall that there are two ways to set up E2E tests: the obvious (although complicated) method is to set up an instance of your server to run your tests against, and the quicker, simpler method is to intercept calls to the server and respond with mock data. Because we’ve decided to use the former method in order to more accurately mirror production, my examples will follow it, but both methods depend on the same principle of recreating application state.

I’ve identified four pieces to the state of a web app:

  1. Client State
  2. Session State
  3. Application State
  4. Server State

Client State

Client state is simply what state your client, and everything that composes it, is in — this includes OS, browser, cookies, local storage, and browser plugins. Using Protractor, it should be fairly easy to control this state by specifying an agent (eg Chrome via Selenium, or PhantomJS) and version. Aside from setting your agent in your test runner’s configuration, you probably won’t need to worry too much about client state when writing tests until you are automating your tests to run across multiple browsers and operating systems — which, by the way, is a great idea.

Session State

Session state is really a subcategory of client state, but it’s important enough that it gets its own section. Session state describes which user you are logged in as while your tests run and how you are authenticating them. This could mean setting a cookie, setting a value in local storage, or perhaps injecting some code to tell the client which user to imitate. Ultimately the goal of setting your browser’s session in testing is to bypass logging in, which is important if you want the ability to recreate any state in your application on demand.

At Fedora, we’ve opted to inject a constant into our Angular app using Protractor’s addMockModule method. This tells our browser to add a special header, X-EMAIL-AUTH=student@example.com, to API requests to identify the user that we want to act as — in this case, student@example.com. To make this work, we’ve also added some code to our server’s test environment that looks for this header and identifies the request as coming from that user.

beforeAll(function() {
  setUser("student@example.com");
});
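Under the hood, a setUser helper along these lines could do the injection. This is a sketch rather than our exact code — the module name "e2eAuth" is an illustrative assumption:

```javascript
// Sketch of a setUser helper. browser.addMockModule is Protractor's API for
// registering an extra Angular module to load alongside the app under test;
// here the module sets a default header on every $http request. Extra
// arguments after the function are passed into it when it runs in the browser.
function setUser(email) {
  browser.addMockModule("e2eAuth", function(email) {
    angular.module("e2eAuth", []).run(["$http", function($http) {
      $http.defaults.headers.common["X-EMAIL-AUTH"] = email;
    }]);
  }, email);
}
```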

Another option here, if you use token-based authentication and you don’t want to check for a header on the server, is to seed tokens (eg, OAuth) for the users that you want to test as and then set them on the client.

Application State

Application state describes what part of your app you are in. For example, our application state might be the Index page of the Students section with a search filter for the email “chris@usefedora.com” applied and a confirmation modal open asking us if we want to delete Chris. The “facts” that each of these pieces of our UI are active make up our current state. Good application design dictates that state should be described in the URL:

/students/index?filter=email:chris@usefedora.com&modal=delete_user

By simply entering the above URL, we can perfectly recreate our state. In reality, not everything is usually reflected in a URL — for example, it’s not necessary to indicate whether or not a dropdown is open or what has been typed into a form field in our URL, but the more that can be reflected in the URL, the better.

For end-to-end testing, the ability to recreate application state through a URL is extremely important. Because the above URL immediately recreates our state for us without having to click on anything or navigate anywhere, it is easy to test our “delete user” modal:

it("should show the user's name in the 'delete user' modal", function() {
  browser.get("/students/index?filter=email:chris@usefedora.com&modal=delete_user");
  expect(page.modalTitle.getText()).toContain("Chris");
});

Without being able to recreate our state, we’d have to first set up the state by going to our app, clicking on the “Students” section on the nav menu, clicking “Add Filter”, entering “chris@usefedora.com”, clicking “Search”, and finally clicking “Delete User”. While it would be great to test each of those pieces of the app, the ideal situation is to test them separately, not as a single user story. The more of your state you can reflect in the URL, the less setup you have to do in your tests.
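One small habit that helps here is building these URLs from a plain state object instead of concatenating strings in each test. A hypothetical helper (stateUrl is an illustrative name, not part of Protractor):

```javascript
// Serialize a state object into a URL so any application state can be
// recreated with a single browser.get(). Purely illustrative.
function stateUrl(path, state) {
  var params = Object.keys(state)
    .map(function(key) {
      return key + "=" + encodeURIComponent(state[key]);
    })
    .join("&");
  return params ? path + "?" + params : path;
}

// e.g. browser.get(stateUrl("/students/index", {
//   filter: "email:chris@usefedora.com",
//   modal: "delete_user",
// }));
```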

Server State

The last type of state we care about for E2E testing is server state, and this is unfortunately the most complicated part of our stack to set up. Your server probably has many components that can be in different states at a time, such as a relational database, a key value store, an in-memory cache, and perhaps some peripheral file stores such as S3 or a local file cache. It’s probably not realistic to recreate all of these components’ state for testing, but the less you have to mock, the more realistic your tests will be.

We’ve found that using an actual instance of our server with mocked data and mocked third party services (eg S3) works great for testing. From there, the real challenge was figuring out how to ensure the state of our data when running a given test.

Take our students index page again, for example. We have a test to check that our search works as expected by feeding it a particular keyword — let’s say, “chris@usefedora” — and expecting it to return the correct results, which in this case would be a single user with the email “chris@usefedora.com”. But how do we know that our user, Chris, is actually in the test database? We can trust that he’s in the seed data, but what if a previous test — say, our test for deleting a user — deleted Chris?

My first take at solving this problem was to make a rule that a test should always clean up after itself, leaving the database in whatever state it was in before the test ran. This worked for a while, but I ended up with tons of setup and teardown code that was hardly even specific to the actual test itself. Over time, I found this approach problematic: sometimes it wasn’t possible to put things back how they were — timestamps, IDs, and randomly generated tokens cannot be controlled and are at the mercy of the server — and if you delete something that has subresources (such as a user), you end up having to recreate all of its subresources every time. What a pain!

The final solution was to clear out the database and reseed it with fresh data between each test. It took many attempts to get this to work with Rails, but we eventually got it working, and it was well worth the struggle: every test can now trust that the data will be in an exact state, and we can write tests that don’t intermittently break depending on how previous tests changed the data. It also means no more setup and cleanup code, and thus more readable, faster-running tests.

Our setup code now looked like this:

beforeEach(function() {
  reseed("students/index");
});

We added an argument to our reseed function so that we could pass it the name of an individual seed file to apply on top of our base set of data, giving each test its own seed file if needed. In practice, most of our tests only used our base seed data, but occasionally we wanted some extra data just for a single test.

Under the hood, the reseed function made a call to our server’s /reseed endpoint, which we added to our server’s test environment. When Rails receives a request to this endpoint, it:

  1. Destroys and recreates the database
  2. Seeds the database
  3. Applies any seeds specific to this test
  4. Clears and reindexes Elasticsearch
  5. Empties our session store

We decided to implement reseeding by adding an endpoint to our server, but this could just as easily have been done by directly connecting to the database or using a shell script.

Combining Everything

Over the course of a few months of experimentation and tweaking, our tests eventually ended up looking something like this:

beforeEach(function() {
  reseed("students/index");
  setUser("student@example.com");
  page = pages.students.index;
});

We had come a long way, but I wanted to simplify things and make them easier to read. Following the Rails community’s obsession with declarative syntax, I rewrote our setup as:

setup("/students", { as: "student", with: "students/index", using: pages.students.index });

And voila… every test could now be set up with a single line of code. Our entire stack — Postgres, Redis, Elasticsearch, Rails, Chrome, and Angular — is in an exact, known state and ready to be tested. In fact, some of our tests are now just two lines: a setup call and an expect statement.
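A sketch of what that setup helper can look like, assuming the reseed and setUser helpers described earlier — and assuming (for illustration only) that the `as` option maps to an email like "student@example.com":

```javascript
// Sketch of a declarative setup helper. reseed() and setUser() are the
// helpers described earlier; mapping the "as" option to an "@example.com"
// email is an illustrative assumption.
var page;

function setup(url, opts) {
  reseed(opts.with);                 // rebuild server state from a seed file
  setUser(opts.as + "@example.com"); // authenticate as the given user
  page = opts.using;                 // bind the page object for assertions
  browser.get(url);                  // recreate the application state
}
```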

describe("students.index", function() {
  it("should show ten users", function() {
    setup("/students", { as: "student", with: "students/index", using: pages.students.index });
    expect(page.students.count()).toBe(10);
  });
  // …
});

Lessons Learned & Best Practices

In the process of gradually making our test environment easier to use, we found that a few conventions work better than others.

Ordering your tests so that you can skip reseeding

It’s generally considered a best practice to write tests that don’t depend on other tests to run first — take the example above of deleting a user causing a later test to break. Recreating our state between tests solves this problem, but it also takes a second to reseed the database and reload the browser, which slows down tests. Instead of religiously reseeding between every test, I’ve found that it’s acceptable to only reseed between groups of tests (eg a describe block), as long as those groups are still modular and can be run in isolation. This ensures that you can still mark a describe block as pending (xdescribe) or focus it (fdescribe) without breaking things.

False positives are bad, but false negatives are worse

Tests are part of a team’s workflow, and that means they are part of your team’s culture. If tests intermittently fail when they shouldn’t, you’ll find your team disabling them (commenting them out or marking them as pending) in order to get around them and continue working, which of course negates the point of having them in the first place. It’s better to have a test that occasionally misses a bug than to have a test that occasionally fails when it shouldn’t.


If you found this post interesting, check out cjroth.com and follow me on Twitter.


If you like the way we think and think you might have something to add to our team, please reach out. We’re hiring!