Tasks for Treejack Tests

Five essential questions for testing navigation

Testing navigation entails giving participants a task and a menu and directing them to complete the task with the menu. You’re essentially asking, “Where would you find X in this menu?”

A task and a menu. Give participants a task and ask them where they’d complete the task in the menu. Collect data about what they click on and what they nominate as the right location. Now you’ve done a navigation study.

Results show whether people found the right category, and how much they had to bounce around to find it. You also see unexpected things, like a large number of participants consistently looking to the “wrong” category. Or participants regularly using one category across different tasks when they didn’t know where else to look. All these results help you refine the navigation.
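If you want to tally those two headline numbers yourself, here is a minimal Python sketch. It assumes you have each participant's click path and final nomination for one task in hand. The `Attempt` record, the `summarize` function, and the rule that "bouncing around" means visiting any category twice are my own illustrative assumptions, not Treejack's export format or its scoring.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Attempt:
    """One participant's attempt at one task (hypothetical record shape)."""
    path: List[str]   # categories clicked, in order
    nominated: str    # category finally chosen as the answer


def summarize(attempts: List[Attempt], correct: Set[str]) -> Dict[str, float]:
    """Share of attempts that ended in a correct category, and the share
    that did so without visiting any category twice (assumed definition
    of "not bouncing around")."""
    total = len(attempts)
    successes = [a for a in attempts if a.nominated in correct]
    direct = [a for a in successes if len(a.path) == len(set(a.path))]
    return {
        "success_rate": len(successes) / total,
        "direct_success_rate": len(direct) / total,
    }


# Made-up data for illustration only.
attempts = [
    Attempt(["Home", "Support", "Contact"], "Contact"),
    Attempt(["Home", "Products", "Home", "Support", "Contact"], "Contact"),
    Attempt(["Home", "About"], "About"),
    Attempt(["Home", "Products"], "Products"),
]
print(summarize(attempts, correct={"Contact"}))
```

A large gap between the two rates suggests people get there eventually but only after wandering, which is worth a closer look at the paths.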

I usually format tasks like this:

Where would you find an answer to the question, “How do I contact customer service?”

This way, it’s clear that the point of the task is to find a location, and the right location should provide useful information.

So, if the structure of the task remains the same, what should you ask about? I draw from five different patterns of tasks.

1. The Essential

Tasks in the user’s voice
Designing navigation entails multiple factors: user priorities, available content, corporate politics, etc. But during testing, I want participants to see tasks that are written from their perspective. The tasks use language they’re familiar with and represent real-world queries.

When writing Essential tasks:

  • Frame these questions as close as possible to the exact words a person would use.
  • Aim to make them as familiar and as comfortable as a well-worn pair of shoes.
  • Hold your ground against objections from the business. I’ve written tasks that seem completely unrelated to the navigation, but reflect the exact thing someone would be looking for.

What you learn from Essential tasks:

  • Responses to Essential tasks give you a sense of how usable your structure is. Tasks that don’t perform well help you prioritize what changes to make to your structure. Tasks that do perform well tell you what not to touch in the next iteration of your structure.

The diagram below shows results from a Treejack test. Small pie charts representing individual categories in the navigation are arranged in a web. Connections between these nodes show how test participants traveled between categories. The pie charts show the relative number of correct vs. incorrect clicks. Correct clicks are represented by the green pie slices and the green connectors. The yellow circles represent the category finally selected by the participant as their preferred response to the task.

Results from an Essential Question (node names removed to protect client confidentiality). We were happy about this outcome, which showed that most participants ended up in the correct areas.

(For more information about Treejack testing, check out Optimal Workshop’s web site.)

2. The Softball

Tasks that are super easy
These tasks use the exact vocabulary of the menu. They ask about one thing. The Softball is crucial because it acts like a control. If you see one or two people getting it wrong, you might be able to throw out their results, surmising they weren’t really paying attention. If you see a lot of people getting it wrong, you know there might be something fundamentally wrong with your menu.

When writing Softball tasks:

  • Repeat the language from the menu in the question.
  • Direct them to content that lives in only one place in the structure.
  • Include no more than two softball tasks.

What you learn from Softball tasks:

  • In short, whether the participant is paying attention. If a participant gets the softball questions wrong, it casts doubt on their other responses. While I’d like to give them the benefit of the doubt, I have to follow the data. And it’s possible someone was clicking random responses.

In this Softball question, the answer could only be found in the category marked A. Responses in the B categories weren’t unreasonable, but not quite right. We felt pretty good about these results.
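The control logic described above can be sketched in a few lines of Python. This is only an illustration: the `review_softball` function, the shape of the `responses` dictionary, and the 30% cutoff for "a lot of people" are assumptions I made up for the example, so pick a threshold that fits your own sample size.

```python
from typing import Dict


def review_softball(
    responses: Dict[str, Dict[str, str]],
    softball_answers: Dict[str, str],
    menu_problem_share: float = 0.3,  # assumed cutoff, not a standard
) -> Dict[str, dict]:
    """responses: {participant_id: {task_id: nominated_category}}
    softball_answers: {task_id: the one correct category}

    For each Softball task, list who missed it. A few misses point at
    inattentive participants; a large share points at the menu itself."""
    report = {}
    for task, correct in softball_answers.items():
        missed = sorted(
            pid for pid, answers in responses.items()
            if answers.get(task) != correct
        )
        share = len(missed) / len(responses)
        report[task] = {
            "missed_by": missed,
            "share": share,
            "suspect_menu": share >= menu_problem_share,
        }
    return report


# Made-up responses for illustration only.
responses = {
    "p1": {"softball-1": "Contact Us"},
    "p2": {"softball-1": "Contact Us"},
    "p3": {"softball-1": "About"},
    "p4": {"softball-1": "Contact Us"},
    "p5": {"softball-1": "Contact Us"},
}
print(review_softball(responses, {"softball-1": "Contact Us"}))
```

Treat the `missed_by` list as a prompt to review those participants' other answers rather than as an automatic exclusion.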

3. The Thesaurus

Tasks that change up the wording
Not everyone uses the same words to describe the same things. These tasks ask about obvious things using different terms. The intent here isn’t to change words for the sake of changing them, but instead to use legitimate, even common alternatives to what’s in the navigation.

On a current project, for example, the company refers to one of its offerings as a “framework.” If we ask directly about a framework, users are likely matching the labels — making it more of a Softball question. Instead, we’re going to ask about a model or solution.

When writing Thesaurus tasks:

  • Make them as specific as possible. The point isn’t to make them ambiguous, but to see whether users recognize conceptual similarities.
  • Target jargon-y elements in the menu. Despite our best efforts, sometimes technical terms appear in the navigation, and these are great candidates for testing.

What you learn from Thesaurus tasks:

  • When participants get Thesaurus tasks right, you know that the labels convey the concepts effectively. When they get them wrong, they may need additional clues, if not in the navigation then in the content, to direct them to the right place.

In this Thesaurus task we asked about a project about a specific topic in a specific location, and we used terms not typically used by the organization. This kind of “shotgun” graph is always concerning. It shows that the task was either too broad or wasn’t understood. Even though there were many places to find the correct answer (yellow), these results still show that the information “scent” isn’t strong when using alternate terms.

4. The Diverging Roads

Tasks with more than one right answer
These tasks ask about things that might live in more than one place in the menu. Good navigation systems help users differentiate categories, giving them clues about where something lives. Navigation testing gives you an opportunity to see how robust these distinctions are. These tasks are purposefully ambiguous, in a sense, because you’re trying to learn where people expect something to live among multiple possibilities. They are phrased to avoid steering participants in one direction or another.

When writing Diverging Roads tasks:

  • Avoid using words that would direct participants to one category above others.
  • Make the tasks as realistic as possible.
  • Avoid making the task too broad, which may direct users to upper-level categories.

What you learn from Diverging Roads tasks:

  • Responses might favor one category over another, or they might be evenly split between the two. With the latter, the navigation menu must clarify for users where the content actually lives. You might adjust the labels in the structure or consolidate categories in the structure. But the structure alone doesn’t need to be burdened with supporting effective wayfinding. These results can tell you where you might need to elaborate with other signposts in the UI, like adding a content block to the page to point to related content elsewhere.

This Diverging Roads task asked about a research report on a specific topic. We expected participants to go to “Topics” or “About Us” where the research reports were stored. That so many went to “News” was an artifact of the old navigation, which would post all reports to the News category.

5. The Missing

Tasks about what isn’t there
Some tasks deliberately ask about information not in the menu. Navigation can’t address every user need. I don’t sweat it too much because I know navigation is part of a total experience. In every test, I include one task that can’t be answered in the navigation.

When writing Missing tasks:

  • Avoid making it too obvious: Don’t ask about a casserole recipe on a banking web site. Instead, ask about something you might reasonably expect the site to have.
  • Make it look like the other tasks. If your other tasks are short and to the point, don’t switch tone and ask about a highly detailed task.

What you learn from Missing tasks:

  • My favorite take-away from these tasks is learning which category users consider the “escape hatch.” That is, if they don’t know where else to turn, where do they go? I call these categories “escape hatches” because they provide users with a place to go when nothing else seems to fit.

Treejack provides another view into the results besides the connected pie charts. In this “destinations table,” the columns represent tasks. The example below shows a test with 18 tasks. The rows represent the navigation you’re testing. The number in each cell shows how many people nominated that category for the response to the task. The color shows whether that is a correct category or not. The example below shows “striping” on one of the categories. When people didn’t know where else to click, they clicked in that Products & Solutions Overview category. That is the “escape hatch” category.

Excerpt from a Destinations Table, another view of the results from Treejack tests. It shows which categories participants selected for each task. When I see “striping” as in the second row of this example, I know that people used that category as an escape hatch.
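Spotting striping by eye works for small tests, but the same check can be sketched in Python if you have the table as plain counts. Everything here is an assumption for illustration: the `find_escape_hatches` function, the dictionary layout, the thresholds, and the numbers themselves, which are invented and are not the client results shown above.

```python
from typing import Dict, List, Set


def find_escape_hatches(
    destinations: Dict[str, List[int]],
    correct: Dict[str, Set[int]],
    min_share: float = 0.15,  # assumed: share of a task's nominations
    min_tasks: int = 3,       # assumed: how many tasks make a "stripe"
) -> Dict[str, List[int]]:
    """destinations: {category: [nominations per task]} (rows of the table)
    correct: {category: task indexes where that category is a right answer}

    Returns categories that attract a noticeable share of nominations on
    several tasks where they are NOT a correct answer."""
    n_tasks = len(next(iter(destinations.values())))
    # Column totals: how many participants answered each task.
    totals = [sum(row[t] for row in destinations.values()) for t in range(n_tasks)]
    hatches = {}
    for category, row in destinations.items():
        stray = [
            t for t in range(n_tasks)
            if totals[t] > 0
            and t not in correct.get(category, set())
            and row[t] / totals[t] >= min_share
        ]
        if len(stray) >= min_tasks:
            hatches[category] = stray
    return hatches


# Invented counts: 3 categories (rows) by 4 tasks (columns), 20 participants.
destinations = {
    "Products & Solutions Overview": [4, 5, 4, 6],
    "Support": [14, 1, 2, 1],
    "About Us": [2, 14, 14, 13],
}
correct = {"Support": {0}, "About Us": {1, 2, 3}}
print(find_escape_hatches(destinations, correct))
```

The thresholds are deliberately loose. The point is to surface candidates for a closer look, not to declare a category an escape hatch automatically.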

Designing Tests

There’s a lot that goes into testing navigation. Designing a test, like designing anything, entails layers of decisions. One crucial decision (but not the only one) is the types of tasks you’ll include. A laundry list of things to find in the navigation is fine, but doesn’t surface real problems in the structure.

Because testing navigation is testing an abstraction, I like using a variety of tasks that tease out insights in different ways. These patterns don’t preclude any particular topics, but instead give you different ways to ask about them. I’m not suggesting you ask about the same thing five different ways. On the other hand, don’t underestimate how much the wording matters: the way you ask something in a navigation test is almost as important as what you ask.

