How to run usability testing at scale

Sarah Hayley Armstrong · Published in Versett · Jun 26, 2017

While usability testing is critical to designing useful, helpful products and features for our clients, it can be tough to know when and how to start regular testing.

Versett recently had the opportunity to help develop the foundation of the usability testing process at a large corporate travel company managing upwards of twelve digital products and platforms. They came to us with the following ask:

Help us build better products by setting up the right framework for usability testing throughout the product lifecycle for all products within the organization.

As proof of concept, we piloted this initiative on one of the app projects we are currently redesigning. Over the course of two weeks, we created a testing plan and script, sourced users, designed and animated a Flinto prototype, conducted user tests, and analyzed our results. Here is how it shaped up:

It all starts with teamwork

To best understand the process, their team asked if we could run a group workshop that would allow them to join us in moderating some of the user sessions. Encouraged by their enthusiasm for testing, we ran a half-day “How to Moderate 101” workshop. The workshop included:

  • A moderation crash course focusing on the think-aloud method, active listening, and avoiding bias
  • A usability script walk-through
  • A live user test recorded with Mr. Tappy and screencast to a viewing room using Zoom

Testing script & goals

While the goal for this round of testing was largely to practice running user studies, we also used it as an opportunity to do a usability study of the core flows of the mobile redesign in an attempt to answer the following questions:

  • Does this design satisfy the users’ need for increased functionality?
  • Does this design create a noticeable barrier to migration for current app users?
  • Does this design better unify the UI with how the travel data is grouped?

Working collaboratively, we designed testing tasks that would enable us to observe users’ behaviors while interacting with as many features as possible. With a lot to cover during our first round of testing, we landed on seven interview questions and thirteen tasks.

Knowing that six of our moderators would be facilitating usability tests for the first time, we expanded the testing script to include word-for-word instructions on how to introduce the test session, ask the interview questions, and run through the tasks. With this script in hand, the new moderators were able to run their sessions effectively and capture valid user data.
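For illustration, a word-for-word task prompt in this style might look something like the following (a hypothetical example, not our actual script):

    Task 4: “Imagine you have a flight coming up tomorrow. Starting from the home screen, show me how you would find your departure gate. Please remember to think out loud as you go.”
    If the participant stalls: “What would you expect to happen here?”

Scripting down to this level keeps the wording consistent from moderator to moderator, which matters when comparing results across sessions.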

Prototyping

Keeping in mind the inexperience of our moderators, we decided to prototype in high fidelity using Flinto*. Testing in high fidelity allowed us to:

  • Include fast responses and realistic iOS animations to better reflect how a user would interact with a live app.
  • Test complex workflows, UI components, type legibility, and engagement in relation to how users interact with the current live app.
  • Free the inexperienced moderators from worrying about making the prototype work correctly, reducing the chance of human error. This let them focus on observing, and lessened their chances of rushing through or slipping up during the tests.

*Note: Flinto is normally my go-to prototyping tool. However, in using it for this round of testing we noticed a slight drawback: Flinto’s text rendering made the text appear much thinner, which may have skewed some of the feedback we received on legibility.

Recording tools

To record the sessions, we used a combination of QuickTime and Mr. Tappy. Mr. Tappy is an external camera kit that easily attaches to a mobile device. Because it records the device screen as well as where users tap, swipe, and hold the device, it provides contextual data that is invaluable for analysis.

Sourcing participants

Using our client’s user email list, we were able to source ten active users of their travel service. In addition, we sourced fifteen people from elsewhere in the client’s company, outside the mobile team, and two people through posts on Craigslist, Twitter, and a few Slack channels.

Since these recruits either lived close to our main office or would be in the area during the time of testing, we were easily able to schedule every session in person. This worked well for a number of reasons:

  • Both the testers and the inexperienced moderators felt more at ease. Face-to-face interviews with target users tend to uncover more realistic feedback and issues, and participants tend to be more engaged. Moderators can ask reactive, probing questions in response to both the user’s actions and body language. In-person interviews are also better for longer tests; ours ran 35 minutes on average.
  • Richer data. The opportunity to question users in person leads to deeper insights and surfaces more potential issues. We also wanted feedback on the concepts that departed from the current app designs, and to gauge users’ emotional reactions to that departure.
  • Full team collaboration. At some points during testing, most of the mobile team was in the viewing room observing user tests. This drove a huge increase in collaboration and buy-in for designs and updates.

Live viewing

One of the big questions we needed to answer for this test was how best to screencast the user tests to the viewing room while keeping the live video feed accessible to remote viewers. We knew from experience that our best options would be Lookback or screen sharing using video conferencing software.

Upon discovering that Lookback was not our best option**, we turned to video conferencing. Since the client’s firewall blocks Google and several other products, our options for screen sharing were Zoom and Webex, tools the client was already familiar with and comfortable using. Between these, Webex was the clear winner, allowing us to set up day-long screencasts where viewers could enter and leave as they pleased without disrupting the test. During testing, I arranged my screen with the Mr. Tappy feed on the left and the testing script on the right, so viewers could follow along in the script while I moderated.

**While I’ve used Lookback before with success, it doesn’t have a native integration with Flinto or Mr. Tappy. We decided that the workaround was too complicated given our inexperienced moderators and the large number of tests we would be conducting.

End results

Across eight moderators, we managed to conduct twenty-seven user tests over the course of three days (at roughly 35 minutes per session, that is about sixteen hours of moderated testing). Our handoff for this round included a detailed presentation of UX findings and results, video highlights, and steps to move forward. Overall, this round of testing was very well received, and we got some great feedback from our client:

“These tests will be very valuable towards our product delivery. It’s much more effective in the growth of the MVP to stop and listen to what users say.” -Senior Manager, Mobile Strategy And Product Development

“Thanks for taking us through the prep for customer research today, the team found it extremely helpful…we definitely thank you for pushing us to the finish line this Monday. I hope you guys feel as good as we do on where we landed with the output.” -Director, Mobile, Product Development & Strategy

Notes for improvement

Although we gained valuable insights, there were areas that we believed could be improved upon for the next rounds of testing. The suggested improvements included:

  • Scale down the scope of the features and flows tested in each round. For this round, we had an almost fully functional, animated Flinto prototype with complex flows and interactions. Testing with smaller, shorter, lower-fidelity prototypes in the future would help us run testing on a quick, regular basis. More regular testing would uncover potential usability flaws as quickly and early as possible, and would keep the design process agile.
  • Decrease the number of participants in each round to an industry standard of six. Given the goal of a usability study (uncovering flaws quickly), scaling up the number of sessions can seriously slow the project’s pace; the quick sketch after this list shows why a small round still catches most issues.
  • Decrease the number of facilitators to the fewest possible. Fewer facilitators would ensure that data is gathered consistently, and would also allow the more experienced moderators to conduct the tests.
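A rough rule of thumb supports the six-participant figure: Nielsen and Landauer’s problem-discovery model estimates the share of usability problems found by n participants as 1 − (1 − λ)^n, where λ is the probability that a single participant surfaces any given problem (about 0.31 in their studies; the exact value varies by product, task, and user group). A minimal sketch in Python, assuming that detection rate:

    # Problem-discovery curve (Nielsen & Landauer): share of usability
    # issues found by n participants. lam = 0.31 is their reported average;
    # treat it as an assumption, since it varies by product and audience.
    lam = 0.31

    for n in range(1, 11):
        found = 1 - (1 - lam) ** n
        print(f"{n:2d} participants -> ~{found:.0%} of issues found")

With λ = 0.31, six participants surface roughly 89% of issues, so doubling the number of sessions buys relatively little extra discovery while doubling the moderation and analysis load.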

An easy-to-implement, flexible framework

With both teams happy with the insights we gained and excited about how regular testing would benefit the quality of the product, we got to work on what was missing: a UX framework that would make user-centered thinking scalable within our client’s large organization.

Centered around rapid prototyping and testing, the framework is aimed at both decision-makers and employees across the corporation, regardless of prior UX knowledge. The easily digestible framework includes:

  • A comprehensive UX thinking education primer. Since most employees at the corporation have zero familiarity with UX thinking, this educational primer focuses on teaching the importance of good design, the basics of user-centered design, and how to apply UX thinking within the organization.
  • A concise cheat sheet summarizing all the processes and decisions that are needed to run successful user studies.
  • Accompanying documents to clarify some of the points raised in the cheat sheet, such as a high-level testing diagram, the different methods of testing, when user studies are most beneficial in the product lifecycle, how to facilitate usability testing sessions, how and when to implement feedback, and how to establish the right feedback channels.

Overall, this round of testing took an inspiring amount of effort, time, and commitment from both the client team and Versett. We’d like to give a huge thank you to everyone who was involved. Developing this framework was a great exercise for Versett to impact UX thinking in a large organization, and we’re excited to get to work on the next round of testing.

Versett is a product design and engineering studio. If you like this post, you’d love working with us. See where you’d fit in at http://versett.com/
