The software tester’s kryptonite: test data. In a world where applications can serve millions of customers, it’s impossible to confidently say “we’ve tested everything”. Here’s the story of one approach I used to increase our coverage confidence.
Testing software can feel like a daunting, never-ending task. As testers, we learn quickly that it’s impossible to test every scenario, every corner of an application, and uncover all defects. I have worked on applications that vary a lot in complexity. Some apps are simple: their functionality is set and every user consumes it as-is. For example, an application that shows your flight information on monitors at your gate. Other apps have a limited user base and are built for a targeted audience, which allows testing to home in on fewer user personas. These apps are more common. The most common, though, have a challenging user base: everyone. Think Facebook, Twitter, Target or Best Buy’s eCommerce stores, mobile banking apps, and (drum roll intro for this story) airlines’ online check-in applications.
To give perspective on the vastness of the data-permutation problem we faced on the particular project I’m going to talk about, here are a couple of examples of real systems that deal with relatively large varieties of data in different ways.
Think of a retailer like Target, Best Buy, or Walmart. These retailers carry thousands of products. Some products, like clothing, come in different sizes and colors. Some require add-on items (like a laptop coming with a Windows install). These companies also offer services and warranty plans. Combine these with a user base of anyone who shops online, and the way an online order looks will vary, a lot. You have store pickup orders, delivery orders, and now, with COVID, curbside pickup. There are subscriptions and pre-orders to account for. Despite all this data variety, though, the variations are part of the system itself, not of its user base: if you go to place an order, you add items to your cart and choose your fulfillment options within the defined bounds of their system.
Now, think of your bank’s mobile app. You can deposit a check, transfer money, and check your account balance. The most obvious scenarios tested for banking systems are, of course, differing dollar amounts. Throw in customers who also have a credit card and a savings account; business or corporate accounts with large deposits and withdrawals multiple times a day; small accounts with a deposit every few months; and accounts with automatic deposits. Similar to the previous example, there are a lot of data scenarios to think about, but they are still bound to the system’s set of rules (like daily deposit limits).
Finally, the real example of the problem my team faced: not one, not two, but three major airlines joining forces to allow seamless check-in and boarding pass generation for passengers. Our customer base: any passenger of any of these airlines, traveling from one airline’s destination to another, with any number of stops in between. Neat! Any air traveler in the world is now a perfectly viable user. For any airline, in any part of its system, passenger data poses difficult challenges. As a passenger, I can book a one-way domestic flight, by myself, from origin point A to destination point B. Great. Simple. Except… not, because if you have ever traveled by air, you have most likely had a different itinerary than the one I just described.
Our Data Challenge
One passenger, two passengers, seven passengers. Nine or more passengers. Group travel. Single-segment trips (point A to B). Round-trips. Connections! Domestic flights, international flights. Traveling with your spouse and infant from Canada, picking up grandma in Florida, continuing on to Mexico. Going to Europe for six months! Traveling in and out of countries that require visas. Corporate travel. Being eligible for an upgrade! Being eligible for an upgrade when your travel companion isn’t! Upgrading to First Class :) Declining an upgrade :( Being a miles club member (or not). Traveling as an unaccompanied minor. Traveling with a peanut allergy, a wheelchair, or a vision impairment. Checking your bags. Not checking your bags. Checking bags with a special item. Originating your trip at an airport that requires earlier check-in when you have checked bags. Paying with miles, paying with a credit card, paying with a reward certificate. Traveling in Main Cabin, or Comfort, or First, or…
Any of these things, combined in any way, are fair game for scenarios that an airline’s check-in system must account for. These systems are powered by some intense logic and have been in use for years. Our challenge: how are we, as software testers, supposed to cover an acceptable fraction of these scenarios?!
You Can’t Test Everything
Luckily, we were able to pull metrics that helped guide our testing journey. We had reports that told us the most common types of trips we processed and the sizes of travel parties. We could see the most common destinations, and whether people were checking in online, with the app, or at the airport. Using this information, we could target our testing to ensure confidence that the check-in process worked for a majority of our traffic. This became our core set of automated and manual regression tests. I highly recommend taking a similar approach if your application under test has a large variety of user scenarios.
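As an illustration of that prioritization idea, here is a minimal sketch, not our actual tooling: the scenario names and traffic counts are invented, but the logic, picking the smallest set of scenarios that covers a target share of traffic, mirrors the approach described above.

```python
# Hypothetical sketch: choosing a core regression set from traffic metrics.
# Scenario names and counts are invented for illustration.
from collections import Counter

def core_scenarios(traffic: Counter, coverage_target: float = 0.8):
    """Return the smallest set of scenarios covering the target share of traffic."""
    total = sum(traffic.values())
    chosen, covered = [], 0
    for scenario, count in traffic.most_common():
        chosen.append(scenario)
        covered += count
        if covered / total >= coverage_target:
            break
    return chosen

traffic = Counter({
    "domestic one-way, 1 pax": 5200,
    "domestic round-trip, 2 pax": 3100,
    "international round-trip, 2 pax": 900,
    "group (9+) international": 80,
})
print(core_scenarios(traffic))
# With these made-up numbers, the two domestic scenarios already cover >80%.
```

The point of a sketch like this is that the core regression suite falls out of the data rather than out of guesswork; the long tail of rare scenarios is handled separately.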
Non-Traditional Use of Test Automation
Even with these tests, though, our partner airlines were still finding issues that slipped past our automated and manual testing, and that were often missing from our requirements entirely. After a few rounds of triaging their defect reports, we concluded that these were scenarios we had either never thought of or did not know were possible. Airlines copy data from production to populate test systems, and our partners were taking advantage of this to mine random data and see how our system reacted. Smart! We could do this too, and it gave me an interesting idea…
For most* of our bazillion different user scenarios, we know this to be true: each passenger should be checked in for their flight and receive a boarding pass. Similarly, for any international trip, each passenger needs to provide their travel documents, check in, and receive a boarding document. Our automation framework’s modular approach meant that we could build out the aforementioned scenarios very easily, for as many scenarios as we wanted to test; I’m talking ~5 minutes of effort. We data-drive our tests from a simple CSV file, providing a record locator and origin city to look up the trip. So, I created these basic tests (check in, get boarding pass) that we’d feed randomly mined test data into, just to see what happened!
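To make the shape of this concrete, here is a hedged sketch of what such a CSV-driven test loop could look like. This is not our framework’s real API: check_in() is a stub standing in for the actual check-in flow, and the column names simply mirror the record locator and origin city described above.

```python
# Hypothetical sketch of a CSV-driven check-in test.
# check_in() is a stand-in for the real framework call.
import csv
import io

def check_in(record_locator: str, origin: str) -> dict:
    # Stub for the real check-in flow; always succeeds here.
    return {"checked_in": True, "boarding_pass": f"BP-{record_locator}"}

def run_checkin_tests(csv_text: str) -> dict:
    """Run the one invariant we assert for (almost) every trip:
    the passenger checks in and gets a boarding pass."""
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        outcome = check_in(row["record_locator"], row["origin"])
        results[row["record_locator"]] = (
            outcome["checked_in"] and bool(outcome["boarding_pass"])
        )
    return results

# Record locators and cities below are made up for illustration.
test_data = """record_locator,origin
ABC123,MSP
XYZ789,ATL
"""
print(run_checkin_tests(test_data))
```

Because the assertion is the same for every row, adding a new scenario is just adding a line to the CSV, which is what made the ~5 minutes of effort claim possible.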
It was simple and fast: find any random flight, copy the record locator and origin city into our CSV, and run the tests. You could run as many as you wanted. We started catching the same random scenarios that our partners were finding. We found new oddities, and we were able to increase our confidence that we had adequate test coverage. I talk about this approach often, as I think it’s a great example of how automation can support manual testing in non-traditional ways. Taking a deeper dive into the problem at hand, then tailoring the automation approach to best solve that problem, can go a long way and make everyone’s lives much easier!
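For illustration, the mining step could be sketched like this, under the same assumptions as before: prod_copy is a stand-in for the production data copied into the test system, and the record locators are made up. The output is rows in the same CSV format the tests consume.

```python
# Hypothetical sketch: sampling random trips from a production copy
# into the CSV format our data-driven tests read.
import csv
import io
import random

# Invented (record_locator, origin) pairs standing in for real production data.
prod_copy = [
    ("QWE456", "JFK"), ("RTY321", "CDG"), ("UIO654", "MSP"),
    ("PAS987", "AMS"), ("DFG852", "LAX"),
]

def mine_random_trips(records, n, seed=None):
    """Pick n random (record_locator, origin) pairs and emit CSV text."""
    rng = random.Random(seed)
    sample = rng.sample(records, n)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["record_locator", "origin"])
    writer.writerows(sample)
    return out.getvalue()

print(mine_random_trips(prod_copy, 3))
```

Each run produces a fresh batch of random trips, so rerunning the suite keeps probing new corners of the data space at essentially no cost.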
*In some cases, a boarding pass is not generated, and the passenger needs to see a check-in or gate agent to provide further information to obtain one. This would fail the test, but was rare enough that we could quickly pick out these expected failures.
Read more about this project on Delta’s news hub.