How to Quantify Think-Aloud Usability Testing Results

Kanika Bansal · Published in Bootcamp · Jun 7, 2024

In this article, I explore a methodology I successfully applied in a recent academic project, transforming qualitative observations into solid, measurable data. By evaluating a networking platform, I demonstrate how to effectively quantify usability testing results and derive impactful insights.

Numbers are powerful (even though they are often misused in user experience). They offer a simple way to communicate usability findings to a general audience. Saying, for example, that “Amazon.com complies with 72% of the e-commerce usability guidelines” is a much more specific statement than “Amazon.com has great usability, but it doesn’t do everything right.” — NN/g

My Role

This project was conducted by a team of four, with me in the lead role: I took charge of the entire process, guiding the team through the usability testing and ensuring smooth execution and valuable insights.

What did we evaluate?

We evaluated a networking platform called CourseNetworking, which is very similar to LinkedIn.

How is it different from LinkedIn?

CN ePortfolio is specifically designed for students to showcase their academic journey, allowing them to collect, reflect on, and present learning artifacts from K-12 to graduate school and beyond.

A glimpse of what it looks like!

Research Questions 🤔

  • Do novice users need assistance navigating the website and signing up for a free account?
  • How easily can novice users upload their resume and create a profile?
  • How easy is it to add a community to your profile, browse and post within a community, and go back to your profile?

Participants 👥

Our participants varied in academic levels, encompassing both undergraduate and graduate students.

Tasks Evaluated 📝

Each user was assigned three tasks to complete. The sessions took place both in person and remotely, lasting 30–45 minutes each. The tasks were:

  1. Sign up and upload a resume
  2. Join a community
  3. Post a poll in the community

Measuring users' ability to complete a task

We employed the think-aloud methodology, conducting sessions with 10 participants who verbalized their thoughts as they performed each task. Additionally, we recorded the time each participant took to complete each task.

Here is a breakdown of the methodology in simple words: how to find common patterns and trends in your usability testing and quantify the results so they are easier to articulate.

Setting a Benchmark 📏

Set a benchmark that is most suitable for the interface you are testing. For example, on a networking website, a user should be able to post in a community seamlessly; on an e-commerce platform, the user should be able to place an order with just a few clicks. What counts as success for a task depends on the interface being tested.

Using the “Levels of Success” Method 📊

Classify outcomes as success, success with minor issues, success with major issues, and failure. A simple pass/fail measure would assign the same score to users who did nothing and to those who successfully completed most of the task, which seems unfair; levels of success capture that middle ground.

  • Success: The user posts a poll in the community.
  • Success with Minor Issues: The user posts a poll in the community but forgets to select poll visibility.
  • Success with Major Issues: The user posts a poll before joining a community.
  • Failure: The user failed to post a poll.
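To keep scoring consistent across note-takers, the four levels can be written down once and reused. Here is a minimal Python sketch (my own illustration, not part of the study's tooling) that also introduces the letter codes used later on the analysis sheet:

```python
from enum import Enum

class Level(Enum):
    SUCCESS = "a"        # completed the task cleanly
    MINOR_ISSUES = "b"   # completed, but with small slips (e.g., forgot poll visibility)
    MAJOR_ISSUES = "c"   # reached the goal the wrong way (e.g., posted before joining)
    FAILURE = "d"        # did not complete the task

# Scoring one observation from a session
observation = Level.MINOR_ISSUES
print(observation.name, "->", observation.value)  # MINOR_ISSUES -> b
```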

Observe & Note Usability Issues 📝

It's a good practice to appoint a team member as the note-taker in each usability session. The note-taker should document specific usability issues verbalized during the session and pick up on non-verbal cues such as confusion, dissatisfaction, and hesitation.

Finding common patterns

For data analysis, we used a tabular format to identify key trends and patterns. Participants were listed on the right, and the three tasks were listed at the top. We used letters to represent the levels of success.

a = success, b = success with minor issues, c = success with major issues, d = failure

Next, I circulated the sheet among our team members, asking them to tick off the appropriate findings from their respective sessions. This collaborative approach made it easier to spot patterns and draw meaningful insights.

You can use a similar template; it's quick and can be easily drawn on paper. This allows for a shared activity between team members and makes drawing insights fun and engaging!

Use this template for your next study and discover how enjoyable analyzing findings from usability sessions can be!
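If you prefer a digital version of the same sheet, a short script can tally the letter codes per task. This is a hypothetical sketch; the participant IDs and letters below are placeholders, not the study's raw data:

```python
from collections import Counter

LEVELS = {"a": "success", "b": "minor issues", "c": "major issues", "d": "failure"}

# Each row is one participant; each column is one task, coded with a letter.
results = {
    "P1": {"task1": "a", "task2": "c", "task3": "c"},
    "P2": {"task1": "b", "task2": "c", "task3": "d"},
    # ... one row per participant
}

def tally(task: str) -> Counter:
    """Count how many participants landed at each level of success for one task."""
    return Counter(row[task] for row in results.values())

counts = tally("task2")
total = sum(counts.values())
for code, n in counts.most_common():
    print(f"{LEVELS[code]}: {n}/{total} ({100 * n / total:.0f}%)")
```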

Articulate Findings 📈

If 7 out of 10 users posted a poll but did not select the visibility option, then 70% of users completed the task with minor issues. Additionally, I reported the 95% confidence interval for this percentage to provide a more comprehensive understanding of our data. (Also recommended by NN/g)

We used graphs to visually communicate the percentage of users at each level of success. This visual representation made it easier to understand the distribution of task performance and identify areas for improvement.

Task 1: Uploading your resume

70% of our participants could complete the task without an error. *Based on this result, we expect that between 39% and 89% of our general population will be able to complete the task.

30% of our participants could complete the task with a minor issue. *Based on this result, we expect that between 10% and 60% of our general population will be able to complete the task.

(*) The ranges represent 95% confidence intervals calculated using the Adjusted Wald method.
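For readers who want to reproduce these ranges, here is a minimal sketch of the Adjusted Wald (Agresti-Coull) interval; the function name is my own, but the formula is the standard one. Small differences from the reported ranges come down to rounding.

```python
import math

def adjusted_wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% confidence interval for a proportion using the Adjusted Wald method."""
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

low, high = adjusted_wald_ci(7, 10)   # 7 of 10 participants completed without error
print(f"{low:.0%} to {high:.0%}")     # roughly 39% to 90%
```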

Task 2: Joining a Community

80% of our participants could complete the task with a major issue. *Based on this result, we expect that between 47% and 95% of our general population will be able to complete the task.

20% of our participants could not complete the task. It was a failure. *Based on this result, we expect that between 5% and 52% of our general population will not be able to complete the task.

Task 3: Posting a poll in the community

70% of our participants could complete the task with a major issue. *Based on this result, we expect that between 39% and 89% of our general population will be able to complete the task.

30% of our participants could not complete the task. It was a failure. *Based on this result, we expect that between 10% and 60% of our general population will not be able to complete the task.

A stacked bar graph to showcase findings for the overall study
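A chart like that can be reproduced in a few lines of matplotlib. This sketch uses the percentages reported above; the level labels and task names are my own shorthand:

```python
import matplotlib.pyplot as plt

tasks = ["Upload resume", "Join community", "Post poll"]
levels = {
    "Success":               [70, 0, 0],
    "Success, minor issues": [30, 0, 0],
    "Success, major issues": [0, 80, 70],
    "Failure":               [0, 20, 30],
}

# Stack one segment per level on top of the previous ones for each task.
bottom = [0, 0, 0]
for label, values in levels.items():
    plt.bar(tasks, values, bottom=bottom, label=label)
    bottom = [b + v for b, v in zip(bottom, values)]

plt.ylabel("Participants (%)")
plt.legend()
plt.show()
```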

Quantifying our findings not only clarified the extent of usability issues but also provided a clear visual representation of user performance. This approach enabled us to effectively communicate our insights and pinpoint specific areas that need attention for enhancing the overall user experience.

Insights Beyond Numbers 🔍

To gain deeper insights from the testing, noting down key usability issues is essential. As users navigate the tasks, make sure to note each verbalized issue they encounter. It's like piecing together a puzzle; each comment and hesitation helps you understand the core problems and why certain tasks failed.

For example, if 80% of users completed task 2 with major difficulties, discuss with your team the issues and errors they observed during their testing sessions for task 2, then mark the presence or absence of each issue for every participant.
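One lightweight way to do that marking is a presence/absence grid per participant. This is a hypothetical sketch; the issue names and True/False values are illustrative, not the study's raw notes:

```python
# True means the issue was observed for that participant, False means it was not.
issues = {
    "unsure resume uploaded":       {"P1": True,  "P2": True,  "P3": False},
    "could not find community CTA": {"P1": True,  "P2": True,  "P3": True},
    "clicked poll before joining":  {"P1": False, "P2": True,  "P3": True},
}

for issue, marks in issues.items():
    n = sum(marks.values())
    print(f"{issue}: {n}/{len(marks)} participants")
```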

Here’s what I discovered:

60% (6/10) of users were uncertain whether their resume had uploaded successfully.

A common query was, “Wait, did it get uploaded?” Additionally, there were expectations for the platform to autofill portfolio sections based on the resume’s data, as indicated by one user: “Doesn’t it autofill my data from the resume?”

100% of users experienced confusion about navigating to the community call-to-action (CTA).

Most asked, “Where is the community?”, and were unclear about the community’s meaning, asking, “What is a community?” Some attempted to find it using the primary search bar, while others used the hamburger menu.

70% of users clicked the poll CTA before joining a community.

After reaching the homepage, they were more likely to notice and click the poll button before seeing the “Join a Community” CTA.

While these quantitative highlights helped us articulate our findings more clearly, we also gathered valuable qualitative insights that identified core problems in the user interface. Quantifying results from think-aloud testing provided us with measurable data to identify trends and common issues, enhancing our overall understanding of the user experience.

I hope you found this article helpful! If you have experience with similar methodologies or have unique approaches to quantifying usability testing, I would love to hear your thoughts! :)
