Pivoting to Weekly User Testing: A Blueprint for Success

Mike Montaño · Animoto
11 min read · Feb 5, 2021

Illustration by Kaylee Reynolds

In graduate school, my advisor gave me one of the most important pieces of research wisdom I’ve ever received. After a hiccup during one of our studies, she shrugged and said “The course of science never runneth smooth.” It’s a phrase that has stuck with me throughout my career, and I mentally repeat it to myself anytime there is an error in a prototype or I flub a question during an interview. As researchers, we always strive to be perfect, rigorous, and repeatable, but sometimes you’ve just got to find a way forward so that research can best meet the needs of the team. That’s the situation I recently found myself in as I worked to pivot our research process from its previous form to a rapid, weekly testing cadence. Oh, and I am the only researcher currently on board.

My name is Mike Montaño and I work at Animoto, a video creation tool that small business owners and consumers can use to easily create videos to market themselves, and their products or content on social media. I’ve been at Animoto for nearly four years, and have been the lead UX researcher for nearly the past two and a half. I have a master’s degree in experimental cognitive psychology from Auburn University and have worked in the UX field for almost eight years.

While at Animoto, I had fallen into a comfortable routine of running in-person moderated usability tests on a two-week (possibly three-week, depending on the constraints and research question) cadence from kickoff to readout, while also juggling other requests, consults, and smaller projects. However, in October of 2019, Animoto as a company, and the design team specifically, saw a new opportunity to move the product forward as a whole.

The design team identified an opportunity to fully redesign the core experience and user interface of the Animoto product. This was a chance to push the user experience forward and provide functionality that would empower users to make more eye-catching videos at a significantly faster pace, optimized for various platforms and aspect ratios. It was a huge project that would require Engineering, Product Management, and Design to all crank harder than ever before, in concert with one another, to get the product launched into beta by the second quarter. With so many new designs and interactions coming to users, a new approach to rapid research and testing needed to kick into gear quickly. Let me tell you what we did, what went well, and what we sacrificed to run effective weekly testing.

Illustration by Kaylee Reynolds

What Did We Do?

After taking some time to work closely with the designers to define the best approach to the testing structure, we settled on a plan that would allow for rapid iterative testing, built on fast recruiting and group synthesis. Recruiting would begin on Thursday and was accomplished through a subscription with UserInterviews.com. The new recruiting round would run for 24–48 hours while potential participants were filtered onto the dashboard by the service. On Monday, I would meet with the team to sync on the testing needs for the week. If there were early prototypes or general sketches, we could walk through them at this point so I could begin thinking about what the moderator’s guide might look like and start working on an outline.

I would then log back into UserInterviews.com to approve qualified participants from the pool of surfaced options. Four slots were opened throughout the day on Wednesday, and approved participants could self-select into whichever time slot worked best for their schedule. New potential participants were always being added to the pool after the first 24 hours, so I would frequently return to the dashboard to approve new participants in the hopes of filling all four time slots before Wednesday.

By Tuesday, the designers had finalized their designs of the prototypes that were to be tested on Wednesday. So, we would meet again to step through the process and sync on all the questions they hoped to have answered during the session. At that point, I would flesh out the draft of the moderator guide with the new information from the prototypes and questions from the designers and send the guide out to the group for any feedback and comments. Ideally, the time slots would be filled by this point so I would send out the calendar invites for the sessions to the team so they would be able to listen in. I’d send out reminder emails to all the participants with troubleshooting instructions and my contact information so they could reach out if they encountered any difficulties logging into the remote sessions.

On Wednesday morning, I would send out a blank document to be used for notes and ask for volunteers to take notes. For each session, there would be at least three people on the Google Hangout: myself, the participant, and one designer to take notes during the session. I would refer to the designer as a research colleague at the outset of the session to avoid biasing the participant into only giving positive feedback to the creator of the designs. Additionally, all sessions were recorded using QuickTime with the participant’s consent. Each session lasted one hour, and sessions were scheduled with a thirty-minute gap between them to allow for potential spillover due to late arrivals, technical difficulties, et cetera.
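To make the day’s logistics concrete, here is a minimal sketch of how those slots could be laid out. The 9:00 AM start time is purely an assumption for illustration; the one-hour sessions, thirty-minute buffers, and four slots match the process described above.

```python
from datetime import datetime, timedelta

# Assumed start of the testing day (illustrative only).
day_start = datetime(2021, 2, 3, 9, 0)

SESSION_LENGTH = timedelta(hours=1)   # each moderated session ran one hour
BUFFER = timedelta(minutes=30)        # gap for late arrivals / tech issues
NUM_SLOTS = 4                         # we capped each round at four participants

slots = []
for i in range(NUM_SLOTS):
    start = day_start + i * (SESSION_LENGTH + BUFFER)
    slots.append((start, start + SESSION_LENGTH))

for start, end in slots:
    print(f"{start:%I:%M %p} - {end:%I:%M %p}")
```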

After the sessions were completed the entire team would meet to debrief. This group consisted of the product managers, product designers, engineers, and myself. A note taker would be designated at the beginning of the meeting and I would step through the prototype with the team and call out key findings and consistent feedback from the participants at each turn. After my initial readout, I would open the floor to other team members to see if they had other feedback about what I had called out or additional findings that they had seen during the sessions so we could discuss them as a group. At the end of the debrief the team would discuss next steps and how to take action based on the results of the sessions and would begin to work on new iterations to cover next Monday. On Thursday, I would spin up another round of recruiting on UserInterviews.com and the process would begin again.

What Went Well?

There were three main techniques and approaches that allowed this weekly testing cadence to work for the team: supercharged recruiting, over-communication between product design and UXR, and a streamlined moderator guide scripting process.

Recruiting

This entire section will inevitably sound like a paid advertisement for UserInterviews.com, but I promise I am in no way being compensated for it. Most researchers dread recruiting because it can be an arduous, frustrating, and wildly time-consuming process. When the team was in the early stages of planning weekly testing, I made it clear that a weekly cadence would only be possible if we could significantly reduce recruiting time and be assured that the reduced time would be replicable each week. After some investigation and trial runs with different approaches, I brought UserInterviews.com back to the team. Luckily, research is highly valued at Animoto, so I did not have to argue particularly hard to get buy-in from the team and executive-level stakeholders. After crunching the numbers, it was clear we would be recruiting and running enough sessions that an unlimited monthly membership with targeted recruiting based on the participant’s job (since we were after a specific participant type) would make the most sense. The consistency, both in the speed with which participants were surfaced to the dashboard and in the caliber of participants who took part in the sessions, was incredible. Without such a powerful pipeline of reliable participants, this weekly testing plan likely never would have gotten off the ground.
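For anyone weighing a similar decision, the back-of-the-napkin math is just a break-even comparison between à la carte recruiting and a flat subscription. Every number below is a hypothetical placeholder, not Animoto’s actual pricing or volume:

```python
# All figures are hypothetical, for illustration only.
per_participant_recruiting_cost = 50   # assumed a la carte incentive + recruiting fee
participants_per_week = 4              # four sessions per weekly round
weeks_per_month = 4
flat_monthly_subscription = 600        # assumed unlimited-plan price

a_la_carte = per_participant_recruiting_cost * participants_per_week * weeks_per_month
print(f"A la carte: ${a_la_carte}/month vs. subscription: ${flat_monthly_subscription}/month")
# At a steady weekly cadence, the flat plan quickly becomes the cheaper option.
```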

Illustration by Kaylee Reynolds

Over-communication

One of the most important parts of working with teammates is learning how to communicate effectively. While it was always important to stay in touch when working in the office together, the need to communicate clearly and frequently became even more important once we were all physically apart from one another during the pandemic. There were certain weeks when the prototype was being finalized in the hours before the test was set to run, and times when the moderator’s guide was being tweaked right before the first session began. Stressful situations like that can easily lead to mistakes and miscommunication, so the team made sure to focus on over-communicating throughout the week, especially during the lead-up to the first session kicking off. Using shared tools like Slack and Google Docs, we could clearly see the status of various deliverables and let one another know when new content was ready for the team to review. Streamlining communication together, while staying flexible in our newly remote world, allowed us to get into a rhythm and hum like a well-oiled machine.

Scripting

This was an area where weekly testing really gave me an opportunity to grow and develop as a researcher. In previous testing situations, I preferred to get all of the details perfect before sharing a moderator’s guide with team members for review. I certainly know that no first draft is ever the final draft, but I often let perfectionism get the better of me and spent too long tweaking everything to get it just right. With the shorter deadlines in place, I needed to move beyond this habit and work more effectively within the reduced time frame. That meant ceding some control in order to, as mentioned above, communicate early and often. Functionally, this required sharing the moderator’s guides for feedback at earlier stages than I was used to. It ended up being a boon both for the research and for my development as a researcher: I learned to let go a little and stop chasing perfection in a first draft, and sharing earlier gave the team more time to provide feedback on the guide itself. Coming out of weekly testing, I now find myself far less tight-fisted with first drafts than I used to be.

What Did We Sacrifice?

In order to get into the weekly testing cycle, we had to make some sacrifices. There were three main areas the team came together on and agreed to trade away in order to reach our weekly testing goals: the number of participants in each round, dedicated analysis time, and our shared cortisol levels.

Illustration by Kaylee Reynolds

Number of Participants

With our schedule set up the way it was, there was only one day allotted for the actual testing process. Given that there was only one researcher and only so many hours in the day, we decided to use only four participants in each test, as opposed to the industry standard of five. Now, I can almost hear people already complaining that they’ve run more than five participants in a day, and I certainly understand; I’ve done it myself. However, given that we were going to be doing this for an undetermined length of time, we wanted to make testing as sustainable as possible, and capping the participants at four was one step we took toward that end. Looking at the famous Nielsen Norman graph, you can clearly see that four participants still provide a wealth of information and data. We further justified this sacrifice with the understanding that we would be running significantly more tests on given parts of the designs. Why do researchers not test with ten participants instead of five? Because the ROI on time invested is better served by doing two tests of five. We decided to lower our participant count to four but run significantly more iterative tests, thereby uncovering over time the same insights that a single test with five would have identified.
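For those who want the math behind that graph: Nielsen and Landauer model the share of usability problems found by n participants as 1 − (1 − λ)^n, where λ is the probability that a single participant surfaces a given problem (roughly 0.31 on average in their data). A quick sketch with that commonly cited value shows why dropping from five participants to four costs relatively little, and why repeated small rounds catch up fast, assuming the rounds probe a similar problem space:

```python
# Nielsen & Landauer model: share of usability problems found by n participants,
# assuming each participant independently surfaces a given problem with
# probability lam (0.31 is the commonly cited average).
def problems_found(n: int, lam: float = 0.31) -> float:
    return 1 - (1 - lam) ** n

print(f"4 participants:   {problems_found(4):.0%}")   # ~77%
print(f"5 participants:   {problems_found(5):.0%}")   # ~84%
print(f"two rounds of 4:  {problems_found(8):.0%}")   # ~95%, if rounds cover similar issues
```

In practice, each iteration also tests a changed design, so later rounds are finding new problems as well as re-checking old ones, which is exactly the trade we were making.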

Time for Analysis

Given that the point of the weekly testing was to provide frequent data to the designers, who could then take those insights and iterate on the previous designs, there was simply not time for an extended data-analysis period. A full day of affinity mapping would have been a day less for the designers to work on new ideas. In order to keep the process as streamlined as possible, a debrief meeting was scheduled for the end of the day, after the last session ended. During that meeting, one person would volunteer to be a notetaker, and UXR would step through the prototype and the moderator’s guide, calling out the main takeaways wherever the majority of participants had issues during the test. After identifying the primary insights, the floor was opened to the rest of the team, who had been observing the sessions as well. This was an opportunity for everyone else to add insights that may not have been included in the original rundown, or to flag areas where they had come away with differing impressions. It allowed the team to get on the same page about the main takeaways and key insights and to identify next steps for the next iterations of the designs. It was a very collaborative and open space that let the team effectively highlight and prioritize the key findings, and it ended up being an illustrative showcase of the collaboration between product design, product management, and UXR.

Cortisol levels

The title of this section is a bit of a joke, but if you can’t tell from how I’ve outlined our process, allow me to be clear: doing testing at this cadence can be (and at times was) incredibly stressful. You need to be able to recruit participants for a test that will cover a prototype you haven’t seen yet, create a moderator’s guide based on incomplete designs, deal with some degree of uncertainty right up to the day of the test, run a marathon of testing in one day, analyze and summarize the key insights at the end of that day, and then turn around and begin the process again the next day. In order to fight the impending stress, it is incredibly important to plan effectively and find a process that works well for you. Finding a rhythm that is sustainable will help you keep working at this clip. Assembling a team that communicates openly and often throughout the week will make the process significantly less bumpy. Remember what my advisor said: “The course of science never runneth smooth.” Be open to making mistakes, but more importantly, take action to learn from the mistakes you make along the way. Finally, hold on and get ready for an exciting ride if your team decides there is a need to start weekly testing and you’re the sole researcher at your company.


Mike Montaño
Animoto

Mike is a UX Research Manager at Animoto. He loves to eat pizza.