At my job at a digital agency, I was given the task of “doing gamification” for one of our clients.
After trying out (and failing with) a couple of existing open source solutions, we decided to implement the gamification engine (GE for short) ourselves in-house, using Symfony and Doctrine. Other implementation details should (and, decisively maybe, will) be granted a post or three of their own. In the meantime, let’s focus on the topic at hand: testing the app.
Edit: it actually happened, you can read all about Gamification implementation.
What is it and why should I care?
Symfony already has excellent support for testing which is a great start for your testing needs. They utilize the battle-tested PHPUnit testing framework and give you the ability to run your Symfony app through the PHPUnit test suite. We can roughly differentiate tests by type: unit tests and functional tests.
I’m not going to explain what testing is in-depth: if you found yourself here, I’m guessing you typed in some quite specific search terms in Google. Some of them might even be “testing”, “symfony” or “applications” while my expectations for “synergy” or “Beyoncé” are somewhat lower.
True unit tests will test only one method at a time, not the whole class, not the object integration, most definitely not the whole app.
Unit tests are low level, super fast, precise, but you need a lot of them.
Pro: fast to execute, precise when they fail (easy to backtrack from the test failure to the bug), mostly easy to write, ability to extract code coverage when running.
Con: as they cover only one method (or more precisely, one code execution path) at a time, you need a lot of them to cover the whole app. Also, the fact that two objects each work on their own (as evidenced by tests) does not mean they’ll work properly together; we need higher-level testing for that, for example functional tests.
True functional tests will run the app, pass the inputs and inspect the outputs. They will not inspect the internal state of the app or anything like that, an approach we call black-box testing.
Functional tests test your app exactly as a user would use it: slow and imprecise, but you need only a couple to break it. When they do break it, they can’t clearly explain what happened, just that it did.
Also, they send you screenshots as Word documents.
Pro: you can cover most of your app with just a couple dozen tests, writing them is easy in a “for this input, expect this output” kind of way, you see the whole system working at once, instant gratification.
Con: really slow to execute compared to unit tests (minutes or even hours compared to seconds), much harder to set up initially, and the results are a lot rougher: unlike with unit tests, you cannot directly pinpoint which line of which method failed. It’s almost a “Yeah, the thing you wanted? It failed.” type of feedback. Still a better love story than debugging by hand.
That was boring, let’s test
So far this has been like a high-level chess game: 20 moves of theory, then you blunder and go home. Our blunder seems to be on schedule, so let’s dive into the nitty-gritty details of functionally testing a Symfony app.
“Request. Inspect the response. Rinse and repeat.”
Our first test is a very simple case of making an HTTP POST with a certain JSON payload and verifying that our app responded with HTTP 200 OK, which brings us to the first important tip:
Always inspect the HTTP status code.
The code is as simple as expected.
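The original listing is not reproduced here, so what follows is only a minimal sketch of such a test, built on Symfony's stock `WebTestCase`. The route `/api/events`, the payload fields and the class name are made up for illustration and are not the GE's real API.

```php
<?php

namespace Tests\Functional;

use Symfony\Bundle\FrameworkBundle\Test\WebTestCase;

class EventApiTest extends WebTestCase
{
    public function testPostingAnEventReturnsOk()
    {
        $client = static::createClient();

        // Hypothetical payload; the real GE payload is not shown in this post.
        $payload = json_encode(['user' => 'alice', 'action' => 'comment_posted']);

        // Argument order: method, URI, parameters, files, server, raw body.
        $client->request(
            'POST',
            '/api/events', // hypothetical route
            [],
            [],
            ['CONTENT_TYPE' => 'application/json'],
            $payload
        );

        // Always inspect the HTTP status code.
        $this->assertSame(200, $client->getResponse()->getStatusCode());
    }
}
```

Asserting on the status code before looking at the body means a 500 shows up as "expected 200, got 500" rather than as a confusing JSON decoding failure further down.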
The side-effect of being (a real-world application)
The above example is short and sweet, but basically it’s a unit test with one very large unit called “the application”. It works if you’re testing a pure function, but for most real-world usage, applications have side-effects.
In computer science, a function or expression is said to have a side effect if it modifies some state (…) [it] might modify a global variable or static variable, modify one of its arguments, raise an exception, write data to a display or file, read data, or call other side-effecting functions.
The side-effect we’re most often looking at is called the database. We’re not afraid of them, we’re actually interested in seeing them happen, the “side effect” here is the main reason for the existence of our app. How’s that for an existential crisis, Hamlet?
We’re using the excellent LiipFunctionalTestBundle, which does most of the plumbing described in the Symfony cookbook; its killer feature is the use of SQLite as the database storage. With this setup, you’re able to fetch the code and run the tests without setting up the database at all. It will:
- create a database for you (or remove the previous one)
- set up the database schema from your ORM model
- load your fixtures (the predefined data against which you run your expectations)
and do everything listed really, really quickly, using the fact that it’s running your app on top of SQLite to its advantage. With it set up, you’re again able to send a request and inspect the response, but now your application can SELECT and INSERT and do other crazy stuff. And life was good.
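As a rough sketch of what a test looks like with the bundle doing the plumbing: the example below assumes the `loadFixtures()` helper exposed by the 1.x/2.x releases of LiipFunctionalTestBundle (later releases moved fixture loading into a separate bundle) and a `test` environment already configured to use SQLite. The fixture class and the route are hypothetical.

```php
<?php

namespace Tests\Functional;

use Liip\FunctionalTestBundle\Test\WebTestCase;

class LeaderboardTest extends WebTestCase
{
    public function testLeaderboardIsServedFromFixtures()
    {
        // Prepares the test database and loads the listed fixtures.
        // The fixture class name is hypothetical.
        $this->loadFixtures(['AppBundle\DataFixtures\ORM\LoadUserData']);

        $client = static::createClient();
        $client->request('GET', '/api/leaderboard'); // hypothetical route

        // The app can now SELECT real rows; we still check the status first.
        $this->assertSame(200, $client->getResponse()->getStatusCode());
    }
}
```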
The impure functional testing
This is a step in the right direction, but we’re not there yet. As you might already be thinking, you cannot gauge the complete correctness of the application just by looking at the responses it gives, or at least it’s quite complex to do so.
For example, your request might need to increment a variable and it does so without error, only it’s now set to 14 instead of 13 and, if you don’t explicitly state the value in the response (which you might not), you’ll miss it and have a nasty bug in your “tested” code. We need to look at exactly what’s happening by querying the database from our tests. This goes against the principle of black-box testing, but I’m against principles that leave me worse off than I’d be without them, such as this one, or returning books to libraries.
We start by subclassing Liip\FunctionalTestBundle\Test\WebTestCase and adding a method by which we assert against the database:
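The original listing is missing at this point, so here is one way such a base class could look. The helper name `assertEntityCount()` and its signature are this sketch's own invention; it assumes the bundle's `getContainer()` helper and a Doctrine ORM setup registered under the usual `doctrine` service.

```php
<?php

namespace Tests\Functional;

use Liip\FunctionalTestBundle\Test\WebTestCase;

abstract class FunctionalTestCase extends WebTestCase
{
    /**
     * Asserts how many rows of $entityClass match $criteria.
     * Name and signature are hypothetical, not the author's original.
     */
    protected function assertEntityCount($expected, $entityClass, array $criteria = [])
    {
        $em = $this->getContainer()->get('doctrine')->getManager();

        // Detach everything so entities are hydrated fresh from the database
        // instead of being reused from the identity map.
        $em->clear();

        $rows = $em->getRepository($entityClass)->findBy($criteria);

        $this->assertCount($expected, $rows, "Unexpected number of $entityClass rows");
    }
}
```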
with our test now looking like this (note, the user exists in the database because it is created from the fixtures):
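A sketch of such a test follows. It assumes a `FunctionalTestCase` base class that offers a hypothetical `assertEntityCount($expected, $entityClass, $criteria)` helper querying the database through Doctrine; the entity, its fields, the fixture class and the route are all invented for the example.

```php
<?php

namespace Tests\Functional;

class AwardPointsTest extends FunctionalTestCase
{
    public function testPostingAnEventStoresExactlyOnePointsEntry()
    {
        // Creates the user "alice" (hypothetical fixture class).
        $this->loadFixtures(['AppBundle\DataFixtures\ORM\LoadUserData']);

        $entity = 'AppBundle\Entity\PointsEntry'; // hypothetical entity
        $criteria = ['reason' => 'comment_posted']; // hypothetical field

        // Before the request: nothing has been awarded yet.
        $this->assertEntityCount(0, $entity, $criteria);

        $client = static::createClient();
        $client->request(
            'POST',
            '/api/events', // hypothetical route
            [],
            [],
            ['CONTENT_TYPE' => 'application/json'],
            json_encode(['user' => 'alice', 'action' => 'comment_posted'])
        );
        $this->assertSame(200, $client->getResponse()->getStatusCode());

        // After the request: exactly one row, not zero and not two.
        $this->assertEntityCount(1, $entity, $criteria);
    }
}
```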
This approach enables us to inspect the database before and after the test runs, which makes us certain that this particular side effect works as expected.
SQLite vs. the (real) world
“That’s great”, you’re thinking, “but my application with 10k simultaneous users isn’t running on SQLite. Well, not anymore, ever since The Incident.” and you’d be right. What good is it running tests on one RDBMS if you’re targeting another? The answers are: speed and portability.
With the “cached database” feature LiipFunctionalTestBundle offers, SQLite will sometimes run your tests up to 10× faster, because it doesn’t need to flush, recreate and pre-fill the whole thing every time; it just copies the prepared file from the cache and goes from there.
Run tests on top of SQLite and, only if they pass, run them on MySQL.
If they fail, I’d rather they do it after 30 seconds than 3 minutes.
By supporting another RDBMS, you’re in a position to switch to it with little change to the app (using an ORM means you’re 98% there anyway, so why not go for the 100% platinum badge). It might not have made sense before, but in today’s fast-moving, cloud-wielding world, this is a worthwhile opportunity. For those parts of the app that are unable to run on SQLite without really crazy tricks, you can always skip those tests. Also, it’s cool to be able to switch the DB system and see if your app could run on it.
Running on top of MySQL
We start off by introducing a new Symfony environment, runtime_test, with this configuration file:
and alter our FunctionalTestCase class, adding support for using this:
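The original class is not shown, so the following is only one possible way to do it: force the kernel environment from an environment variable, so that every kernel the tests boot (the client's and the one used for database assertions) ends up in the same environment. `TEST_ENV` is a name chosen for this sketch, and the untyped `createKernel()` override assumes a Symfony version older than 6, where the parent method declares no return type.

```php
<?php

namespace Tests\Functional;

use Liip\FunctionalTestBundle\Test\WebTestCase;

abstract class FunctionalTestCase extends WebTestCase
{
    /**
     * Boots every test kernel in the environment named by TEST_ENV
     * (hypothetical variable), falling back to the usual "test".
     */
    protected static function createKernel(array $options = [])
    {
        $options['environment'] = getenv('TEST_ENV') ?: 'test';

        return parent::createKernel($options);
    }
}
```

With nothing set, the suite keeps running on SQLite in the `test` environment; exporting `TEST_ENV=runtime_test` before invoking PHPUnit would point the same tests at whatever the `runtime_test` configuration defines.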
Running our tests now becomes really simple:
Note: in our case, the RDBMS switch is not the only difference. As we’re using Gearman for our app, the runtime_test environment also switches on running the tests through Gearman workers synchronously. A detailed explanation of how to set this up is outside the scope of this article, but other tweaks similar to this are also possible.
Running on production
With the same logic as “why run on SQLite when targeting MySQL” we can reason “Why run on development machines if you’re targeting production?”
You don’t care whether it works on your development machine,
does it work well on production?
The problem here is that you can almost never guarantee that you’ve replicated the production environment down to the last detail: a slightly different underlying system, a slightly different network setup, or even different updates applied might mean it works for you, but not on production. Our app had the problem of not having the same MySQL version as production and, while it did work everywhere we tested, it did NOT work on production due to differences in the SQL mode setting, NO_UNSIGNED_SUBTRACTION to be specific.
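For context: in recent MySQL versions, subtracting from an unsigned integer produces an unsigned result by default, so a result that would be negative raises an out-of-range error, whereas with NO_UNSIGNED_SUBTRACTION enabled the result is simply negative. A cheap way to catch this kind of drift early is to compare the server's `sql_mode` with what the app expects. The sketch below is not from the original post; the function names, DSN and credentials are placeholders.

```php
<?php

/**
 * Returns the modes from $required that are absent from a comma-separated
 * sql_mode string, such as the value of @@SESSION.sql_mode.
 * $required is expected in upper case.
 */
function missingSqlModes($actualSqlMode, array $required)
{
    $actual = array_filter(array_map('trim', explode(',', strtoupper($actualSqlMode))));

    return array_values(array_diff($required, $actual));
}

/**
 * Reads the session sql_mode from a MySQL connection.
 */
function fetchSqlMode(PDO $pdo)
{
    return (string) $pdo->query('SELECT @@SESSION.sql_mode')->fetchColumn();
}

// Usage (placeholder DSN and credentials):
// $pdo = new PDO('mysql:host=127.0.0.1;dbname=app_test', 'app_test', 'secret');
// $missing = missingSqlModes(fetchSqlMode($pdo), ['NO_UNSIGNED_SUBTRACTION']);
// if ($missing) {
//     throw new RuntimeException('Missing SQL modes: ' . implode(', ', $missing));
// }
```

Whether a given mode should be required or forbidden depends on what your queries rely on; the point is only to make the difference loud instead of discovering it through a failing query.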
With this reasoning, can we run our tests literally on production? Turns out, yes. It’s scary, but yes.
Running database-trashing tests on production, right next to your
production database is brown-underwear scary.
The idea is, obviously, NOT to run against the production database but to choose a different one. It’s also recommended that you do NOT use the same database user as production: create a new user and give it access only to the test database. This should somewhat lower the murphyesque certainty of your tests, along with the bass, dropping your production database.
The problem with IDs
Turns out, running tests on production systems means that they’re not running on the same setup as your development or even staging machine (that was the whole point). Let’s say your production is set up with master-master replication which, among other things, means auto-increment fields don’t go like 1, 2, 3 but instead go like 1, 3, 5, etc. This means that if you’re exposing IDs in your app’s responses and testing against those payloads, your tests will fail. This brings me to the next piece of advice,
Don’t hardcode database IDs in your testing fixtures / payloads.
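To make the failure mode concrete, here is a small stand-alone illustration (plain PHP, no database involved) of how the same three inserts receive different IDs depending on the server's auto-increment settings. MySQL exposes these as `auto_increment_increment` and `auto_increment_offset`; the function itself is invented for this sketch.

```php
<?php

/**
 * Simulates the IDs handed out for $count consecutive inserts into an
 * empty table, given the server's increment and offset.
 */
function autoIncrementIds($count, $increment = 1, $offset = 1)
{
    $ids = [];
    for ($i = 0; $i < $count; $i++) {
        $ids[] = $offset + $i * $increment;
    }

    return $ids;
}

$development = autoIncrementIds(3);       // [1, 2, 3]
$production  = autoIncrementIds(3, 2, 1); // [1, 3, 5]

// A payload that hardcodes "the second user has ID 2" is right on one
// machine and wrong on the other.
var_dump($development[1] === 2); // bool(true)
var_dump($production[1] === 2);  // bool(false)
```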
To avoid this, Doctrine fixtures functionality allows you to store a reference to a previously-defined fixture and fetch whatever ID it was awarded. We use this to our benefit and switch our payload fixtures to PHP arrays:
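The original listing is not available, so this is a minimal sketch of a fixture that stores a reference, using `addReference()` from doctrine/data-fixtures. The `User` entity, its setter and the reference name `user-alice` are hypothetical.

```php
<?php

namespace AppBundle\DataFixtures\ORM;

use AppBundle\Entity\User; // hypothetical entity
use Doctrine\Common\DataFixtures\AbstractFixture;
// Namespace used by older Doctrine releases; newer ones ship
// Doctrine\Persistence\ObjectManager instead.
use Doctrine\Common\Persistence\ObjectManager;

class LoadUserData extends AbstractFixture
{
    public function load(ObjectManager $manager)
    {
        $user = new User();
        $user->setUsername('alice'); // hypothetical setter

        $manager->persist($user);
        $manager->flush(); // the database assigns the auto-increment ID here

        // Remember the object under a name instead of remembering its ID.
        $this->addReference('user-alice', $user);
    }
}
```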
This means you store references to all objects whose IDs you plan to expose in application responses and just fetch the ID in the payload:
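A sketch of the consuming side follows. It assumes a fixture that registered a `user-alice` reference, and it assumes that `loadFixtures()` returns an executor exposing `getReferenceRepository()`, as it did in the 1.x/2.x releases of LiipFunctionalTestBundle. The route, payload keys and response shape are hypothetical.

```php
<?php

namespace Tests\Functional;

use Liip\FunctionalTestBundle\Test\WebTestCase;

class AwardPointsPayloadTest extends WebTestCase
{
    public function testPayloadUsesWhateverIdTheDatabaseAssigned()
    {
        $references = $this
            ->loadFixtures(['AppBundle\DataFixtures\ORM\LoadUserData'])
            ->getReferenceRepository();

        // Look the user up by reference name; never assume its ID is 1.
        $userId = $references->getReference('user-alice')->getId();

        // The payload is a plain PHP array, so the ID can be computed.
        $payload = [
            'user_id' => $userId,
            'action'  => 'comment_posted',
        ];

        $client = static::createClient();
        $client->request(
            'POST',
            '/api/events', // hypothetical route
            [],
            [],
            ['CONTENT_TYPE' => 'application/json'],
            json_encode($payload)
        );

        $this->assertSame(200, $client->getResponse()->getStatusCode());

        // Hypothetical response shape: the app echoes the user's ID back.
        $body = json_decode($client->getResponse()->getContent(), true);
        $this->assertSame($userId, $body['user_id']);
    }
}
```

Because both the request and the expectation are derived from the same reference, the test passes whether the user ended up with ID 1, 3 or 4711.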
We went from not having any tests to running tests on our production setup, which gives us almost complete certainty that the application runs as expected. This means we’re able to add features quickly without being afraid of breaking existing functionality, while avoiding all the pitfalls of testing in one place and thinking that means it works in another.
The concept is just a first draft and I’m sure it’s flawed in at least one way. He who finds a way will want to do something about it, like add a comment or three.