UX Learnings: Why should I measure my product’s user experience, and how?
KPIs (Key Performance Indicators) matter to everyone working on a product, including UX researchers. It makes sense to keep overall business goals and milestones in mind when striving for a better user experience; in turn, a better user experience should naturally help the business reach its KPIs. But how do we know we are achieving a better user experience?
In UX research, various metrics can be used to assess aspects of your product’s user experience. These metrics are not ‘use once and done’; they should be tracked consistently over time. This matters because it gives you evidence of whether specific changes to your product have improved or worsened the overall experience. That evidence not only helps your PM (product management) and engineering teams prioritize what to tackle next quarter, but also gives leadership a summary of scores they can easily refer to when assessing how well the product is doing.
Benchmarking results also show team members and leadership the value of UX research, which means a higher likelihood of bigger research budgets. Bigger research budgets mean hiring more researchers (and the chance to build out an established department if UX research is fairly new at your company). Establishing research best practices and having more advocates for your department make it more likely that your non-research team members will listen to your findings and act on them.
There are two occasions on which you should capture these metrics: 1) in any one-time moderated/unmoderated usability test, and 2) for your overall product over time. The latter is called benchmarking. Benchmarking your entire product properly and consistently takes more time and money, but it’s worth it in the long run. And once you’ve set up and run one successful round of benchmarking, it’s much easier to keep most of the same setup and simply tweak it for future rounds.
There are many useful metrics out there; some trustworthy knowledge sources include the Nielsen Norman Group and the Interaction Design Foundation. How do you choose?
- Look at a list of your company’s KPIs. Which KPIs/OKRs are tied directly to your product’s user experience? (e.g. increase monthly active users by 15%, increase lead conversion rate by 20%, maintain user safety at 100%)
- Look at a list of your product’s OKRs (Objectives & Key Results). Not all OKRs are objectively measurable. For example, how do you measure user happiness with your product, or users’ likelihood to recommend it to others?
- For each KPI/OKR that is directly tied to your product’s user experience, what are the relevant tasks users need to complete? (e.g. increase monthly active users by 15% for Instagram Reels → Task: Log into the website)
List of post-task metrics:
*Not comprehensive; you may have research objectives that require other metrics!
- Success — Is the user able to complete the task successfully? (Y/N)
- Perceived Success — Does the user believe they have completed the task successfully? (Y/N) How does success compare to perceived success? If they failed the task but believe they succeeded, this is a ‘disaster’.
- Perceived Ease of Use — ‘Overall, this task was…?’ (1 = Very difficult to 7 = Very easy)
- Verbatims — Remember that you want to capture not only the issues, but the reasons behind them. This means you should always include an open-ended question after each metric question, for example, ‘Why did you select that answer in the previous question?’, to understand why your user rated a task 1, very difficult.
- Attempts — How many times does the user start the task over before succeeding/failing?
- Time on Task — How much time did the user spend on the task before succeeding/failing? This one can be tricky to interpret, because the raw number of minutes/seconds a person spent on a task means nothing without context. For that specific task, what’s the ideal time on task for an experienced vs. non-experienced user? You can come up with an estimate by adding up the average time for each small action necessary to complete the task successfully; here’s a free online Keystroke Level Model (KLM) with average calculated values for system time on tasks — explanation and online tool. A minimal sketch of this kind of estimate follows this list.
- Perceived Time on Task — Does the user perceive the time they spent on a task to be too long? Is this a factor in their frustration while completing the task or in their perception of the overall product?
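To make time-on-task context concrete, here is a minimal sketch of a KLM-style estimate in Python. The operator durations are the commonly cited averages from the KLM literature, and the login action sequence is hypothetical; substitute values and sequences appropriate to your own product.

```python
# Keystroke Level Model (KLM) sketch: estimate an expert's ideal time on task
# by summing the average duration of each low-level action the task requires.
# Operator times are commonly cited averages (in seconds), not measurements
# from your own product.
KLM_OPERATORS = {
    "K": 0.28,  # keystroke or mouse-button press (average typist)
    "P": 1.10,  # point the mouse at a target on screen
    "H": 0.40,  # move hands between keyboard and mouse ("homing")
    "M": 1.35,  # mental preparation before an action
}

def klm_estimate(sequence: str) -> float:
    """Sum operator times for an action sequence such as 'MPKH' + 'K' * 16."""
    return sum(KLM_OPERATORS[op] for op in sequence)

# Hypothetical login task: think, point at the email field, click, home to
# the keyboard, type a 16-character email; repeat for a 10-character
# password; then think, point at the submit button, and click.
login = "MPKH" + "K" * 16 + "MPKH" + "K" * 10 + "MPK"
print(f"Estimated expert time on task: {klm_estimate(login):.1f}s")  # ~16.3s
```

If participants take, say, 45 seconds against an estimate like this, that gap is a much stronger signal of friction than the raw number alone.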
You likely have multiple features in your product, each requiring a different set of tasks to reach the KPI/OKR, so you’ll have separate post-task metric scores for each feature; one way to roll these up is sketched below. You also want scores for your overall product; some options follow the sketch.
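For example, a feature-level roll-up might look like the following sketch, assuming you export one row per participant per task; the column names and the pandas layout are illustrative assumptions, not a prescribed format.

```python
import pandas as pd

# Hypothetical export: one row per participant per task, tagged with the
# feature each task belongs to. Column names are assumptions for illustration.
results = pd.DataFrame({
    "feature":           ["login", "login", "reels", "reels"],
    "success":           [1, 0, 1, 1],    # observed task success (Y/N as 1/0)
    "perceived_success": [1, 1, 1, 1],    # participant's self-reported success
    "ease":              [6, 2, 7, 5],    # perceived ease of use, 1-7
})

# A 'disaster' is a failed task the participant believed they completed.
results["disaster"] = (results["success"] == 0) & (results["perceived_success"] == 1)

# One row of scores per feature.
summary = results.groupby("feature").agg(
    success_rate=("success", "mean"),
    disaster_rate=("disaster", "mean"),
    mean_ease=("ease", "mean"),
)
print(summary)
```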
List of overall system metrics (ask at the end of all your tasks):
- System Usability Scale (SUS) — 10 questions, each answered on a scale from 1, Strongly disagree, to 5, Strongly agree.
The average SUS score is 68; above 80 is considered excellent. Interpreting a SUS score can be complicated because you have to normalize raw scores to get a percentile ranking. If you don’t want to go through that procedure, you can rely on the platforms mentioned below (e.g. UserTesting), which will calculate the percentiles for you. The raw score itself is simple arithmetic; see the scoring sketch after this list.
- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
- I think that I would need the support of a technical person to be able to use this system.
- I found the various functions in this system were well integrated.
- I thought there was too much inconsistency in this system.
- I would imagine that most people would learn to use this system very quickly.
- I found the system very cumbersome to use.
- I felt very confident using the system.
- I needed to learn a lot of things before I could get going with this system.
- User Experience Questionnaire (UEQ) — 26 contrasting attribute pairs
- Happiness Tracking Survey (HaTS) — 5–7 questions, quantitative and qualitative
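If you collect SUS responses yourself rather than through a platform, the raw-score arithmetic is standard: odd-numbered (positively worded) items contribute (response − 1), even-numbered (negatively worded) items contribute (5 − response), and the summed 0–40 total is multiplied by 2.5 to reach the 0–100 scale. A minimal sketch, with made-up sample responses:

```python
def sus_score(responses: list[int]) -> float:
    """Compute one participant's raw SUS score from 10 answers (each 1-5).

    Odd-numbered items (positively worded) score (response - 1); even-numbered
    items (negatively worded) score (5 - response). The 0-40 sum is scaled by
    2.5 to the familiar 0-100 range.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("Expected 10 responses, each between 1 and 5.")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd-numbered)
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Hypothetical participant who agrees with the positive items and disagrees
# with the negative ones:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # 85.0, 'excellent' territory
```

Note that this gives only the raw score; the percentile ranking mentioned above still requires normalizing against a reference dataset.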
Once you’ve written out your benchmarking research plan, what next?
- Decide how you’ll get your tasks and questions in front of users. If you don’t already have a base of recruited target users, you will need to purchase a subscription to a platform like UserTesting, UserZoom, or dscout. Do your research to choose the platform that best fits your budget and research objectives. On these platforms, you launch your screener plus tasks/questions, and users who have signed up as ‘test takers’ take your screener; whoever qualifies is automatically directed to your actual tasks/questions. However, if your target audience is relatively hard to find in general panels (e.g. physicians, C-suite executives), you may need to hire a recruitment vendor. Vendors are more costly, but they are more rigorous about making sure you are talking to exactly the right types of users.
- Set an appropriate sample size. Remember, a benchmarking test requires quantitative metrics, which means you need a larger sample size for statistically meaningful results. You need a minimum of 20 users for a margin of error of +/-19%, and a minimum of 71 users for a margin of error of +/-10% (analysis by the Nielsen Norman Group). For most benchmarks, 20 users is enough; a quick sanity check of these figures is sketched after this list.
- Do a pilot test of your benchmark before launching it to everyone. Make sure there are no bugs. Because it’s an unmoderated study, you can’t ask follow-up questions in the moment, and you can’t be there to guide users to the next task if they can’t get through the previous one, so make sure to include additional instructions.
- Invite your product stakeholders to a ‘watch party’ of your pilot test. Having them watch a user go through the benchmark for a specific feature may prompt them to think of additional questions/tasks/metrics you need to capture. It also increases their enthusiasm for the final results once they have seen users struggle through a task.
- Publish your study!
- Analyze results. Calculate metrics at both the feature level and the system level.
- Share out results with high visibility. Research results don’t matter unless your stakeholders actually use them! Keep in close communication with your stakeholders from study inception so you understand what will be most useful to them in the results. Create short video clips that highlight major user pain points, to build user empathy among your designers. Help your PMs and engineers prioritize which issues to tackle by giving them an organized list of all issues and walking them through the results. Create a summary scorecard and data visualizations for executives to refer to when making product decisions.
- Run the benchmark periodically as necessary. You don’t have to run the entire system-level benchmark more than once a year. However, if the team is releasing an important feature addition that will impact other features, you may want to run just the part of the benchmark that covers that feature area. It should be easier now that you’ve set up the whole process from beginning to end and run through it once! Remember to meet with stakeholders before each round of benchmarking to stay up to date on product changes and shifting OKRs and KPIs.
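Finally, a quick sanity check on the sample-size figures above: for a binary metric like task success, the margin of error can be approximated with the normal-approximation formula at the worst case p = 0.5. The 90% confidence level below is an assumption chosen because it approximately reproduces the cited numbers; the Nielsen Norman Group’s exact method may differ.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.645) -> float:
    """Normal-approximation margin of error for a proportion.

    p = 0.5 is the worst case; z = 1.645 corresponds to 90% confidence,
    an assumption that roughly reproduces the figures cited above.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (20, 71):
    print(f"n = {n:>2}: +/-{margin_of_error(n):.0%}")
# n = 20: +/-18%
# n = 71: +/-10%
```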