Running a Collaborative Heuristic Evaluation

Kim Theisen
5 min read · Mar 6, 2023


Leadership job title not required


I ran my first heuristic evaluation with a team of five UX professionals. I was teaching myself how to conduct one after learning the basics from my mentor and doing plenty of reading on the subject, and I decided to take my peers along for the ride. It ended up being an excellent learning experience for everyone involved, as well as producing high-quality, actionable feedback for our product owners. This is how I went about it.

Getting Started

We were a young and growing UX team who’d never conducted a heuristic evaluation before. Most of our stakeholders were unfamiliar with the term. But even without a lot of experience, I knew a heuristic evaluation was an excellent fit for the project I’d been tasked with. After a few informal conversations with my manager and peers, it was clear there was interest in pursuing the evaluation as a team. So I started planning.

The first step was choosing a framework. I was familiar with Jakob Nielsen’s 10 Usability Heuristics and had recently been introduced to several others. I ended up choosing two for us to work with: Nielsen’s and Abby Covert’s Information Architecture Heuristics. Nielsen’s heuristics are a widely accepted standard and a familiar name for my stakeholders and peers, while Abby Covert’s clear, simple language made the process accessible for my team and filled in gaps in the Nielsen model.

The next step was deciding on a format for sharing our findings. There would be five of us evaluating the same user flows, and I needed a single place and format to store all of our data. Our team used Figma, so I settled on a FigJam board: it made it easy to create a standardized, replicable template that every member could access and use to report their findings.

An example of a FigJam template

Calibration

I then scheduled a series of meetings, the first to get us all on the same page. We discussed the overall vision and purpose of the work, covered the basics of a heuristic evaluation, and reviewed the frameworks I’d considered. We also went over a rating scale I’d put together, a timeline, and the specific user flows we’d be evaluating. This gave us shared reference points to build consensus around after we’d each taken a pass at the material.

Then we went our separate ways and began the work individually. A few questions popped up along the way, notably about how to rate specific usability issues. We addressed those as they arose, talking through what we saw. Sometimes I had a slightly different opinion about a rating, but I knew the point wasn’t for me to assess the product through the others; it was for us to do this together.

The next meeting was a check-in midway through our timeline. We each spoke to our progress and calibrated on a few points so we were working in similar patterns. Once we’d finished within the agreed-upon timeframe, we met again and went over our results.


Building Consensus and Final Output

We found significant overlap in the issues we’d identified, but each of us had also noted a number of unique ones. This was exactly why I chose to involve peers and make the evaluation a team effort: different people find different usability issues. Research from the Nielsen Norman Group shows that three to five evaluators is a good number, as adding more people yields diminishing returns.
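For anyone curious about the reasoning behind that guidance, it traces back to Nielsen and Landauer’s model of how many usability problems a group of evaluators will uncover. The numbers below are typical estimates from their research, not figures from our evaluation:

\[ \text{ProblemsFound}(n) = N\left(1 - (1 - \lambda)^{n}\right) \]

Here N is the total number of usability problems in the interface, n is the number of evaluators, and λ is the proportion of problems a single evaluator finds on their own (roughly a third in Nielsen and Landauer’s data). With five evaluators that works out to roughly 80–85% of the problems; doubling the group to ten adds only another 10–15 percentage points, which is why a team of three to five is usually the sweet spot.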

I compiled the results into a final report and presentation for our stakeholders, which included a brief explanation of what a heuristic evaluation is and my group’s methodology. It went over very well! Including that explanation helped stakeholders understand that the issues we identified and the recommendations we made were not just our opinions; they were backed by industry standards.

What I learned

The stakeholders for this project, as well as my peers, were very open to and appreciative of my leading this process. Using industry standards in a systematic way to assess our product, and drawing on talent already on our team, made sense to everyone involved.

I learned that taking the time to gauge my peers’ interest at the start of the project, and then calibrating our shared understanding along the way, was key to their involvement and excitement. For our stakeholders, explaining our process clearly, including why we did what we did and how industry-backed standards supported our conclusions, was important in helping them appreciate the value of the effort.

What I’d Do Differently

The main thing I’d do differently is schedule more time to meet as a team. Since this was our first time completing this type of evaluation, additional time together would have helped solidify our shared understanding of the process and given us more room to review our findings.

This was most evident in how we rated the usability issues we found. Even though we’d discussed and agreed upon a rating scale, our interpretations of how severe each issue was still varied significantly.

Overall

The endeavor was a success in multiple ways. Mainly, we learned about and practiced an important UX methodology while delivering quality feedback and recommendations to our stakeholders. The project also brought us closer as a team, built investment in the work we did together, and taught me more about how groups at this particular workplace gel.

I’m Kim Theisen, a freelance User Experience Researcher, writer, thinker, doer, speaker, and teacher. I have close to 20 years of experience helping people understand better ways to learn, conceptualize problems, and collaborate equitably. I love building out practices and finding that aha moment in every study I do. The opinions I put forth here on Medium are my own and don’t represent any client or organization.
