How I Do User Research

Published in HCI & Design at UW, Nov 29, 2014

A combination guide and “how I work”

There’s a great deal of variation in what people mean by user research, depending on the purpose, the context, and so on. This article is a brief overview of my research toolkit, and here’s the tl;dr:

How to Ask Research Questions (write it down, iterate)
How to Conduct Exploratory Interviews (it’s usually worth doing, and the way to interview well is to be flexible and an active listener)
How to Interview to Evaluate a Prototype (be specific, and ask user study participants to compare and contrast, not to criticize)
How to Compose a Survey (keep it short and simple, don’t force opinions, have at least one open-ended prompt for “debugging”)
How to Discover Themes (qualitative analysis is a systematic way of using the human capacity to categorize things)
How to Use Theory and Related Research (if you want to incorporate scientific research, here’s some general ideas on how to approach that)
Fundamental Principles of User Research (or, what I think the fundamental principles ought to be)

How to Ask Research Questions

Recently, a colleague asked me: “what, in lieu of a hypothesis, can consistently guide exploratory research? How do you know that you’re on track, not getting lost in the weeds?” Both in exploratory and experimental research, there are elements of “getting lost in the weeds” in the absence of a good research question.

Charlotte Lee (queen of rubber duckies and a fantastic qualitative methods professor at UW HCDE) recommends having an actual piece of paper, actually displayed somewhere visible in your life, like taped to your machine, desk, or wall, with the research question written on it. She also recommends replacing it with a new one when it seems appropriate.

I manage my tasks and notes using a paper journal, and I have taken to just writing my research question at the top of the page on a day when I’m focusing on that study. Generally, I am working on multiple studies at once, and while it’s possible to make progress on multiple studies in one day (e.g., email people back, sign things, etc.), I like to keep intellectual focus on one study in one day.

When you’re getting lost in the weeds, the question won’t sound like much of a question. “Investigating how people pay for stuff on the Internet” is not a question, it’s a statement. A better version is “When and why do people drop out of the funnel before completing a purchase?” Over time, the question gets more specific, and the “why” becomes “enumerate and characterize reasons.”

For example, people may drop out (1) because their credit card is not on hand and it’s an impulse purchase, (2) because the authorization procedure requires a PIN sent by SMS and the phone is not on hand, or (3) because they got distracted and navigated away. You can get to these themes with some interviews, and then you can use surveys to measure their prevalence.

With these categories in hand, you can begin to ask quantitative questions, using instrumentation of a site to determine the degree to which each of the themes (1) occurs and (2) is a problem that you can or want to solve through design. Generating the themes, though, is the purview of qualitative user research, which is the subject of the following sections. The job of a good research question is to coherently guide you from open-ended exploration to concrete quantitative analysis. Your job, in doing user research, is to routinely hone the research question and (probably) keep it relevant to the design of whatever it is you’re there to help design.
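As a rough illustration of the "measure their prevalence" step, here is a minimal Python sketch. It assumes a survey where each respondent ticks every drop-out reason that applied to them; the theme names and the responses are invented for the example, not data from a real study.

```python
from collections import Counter

# Hypothetical survey responses: each respondent ticks every drop-out
# reason that applied to their last abandoned purchase.
responses = [
    {"card_not_on_hand"},
    {"sms_pin_phone_not_on_hand", "got_distracted"},
    {"got_distracted"},
    {"card_not_on_hand", "got_distracted"},
    set(),  # respondent reported none of the known themes
]


def theme_prevalence(responses):
    """Return the share of respondents who reported each theme."""
    counts = Counter(theme for r in responses for theme in r)
    total = len(responses)
    return {theme: n / total for theme, n in counts.items()}


prevalence = theme_prevalence(responses)
# Print themes from most to least common.
for theme, share in sorted(prevalence.items(), key=lambda kv: -kv[1]):
    print(f"{theme}: {share:.0%}")
```

Respondents who tick nothing still count toward the total, which keeps the shares honest and hints at themes the interviews may have missed.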

Sometimes, instead of getting more specific, the research questions get more vague. This is a good thing: it makes it easy to see that you’re changing course. Otherwise, you would have changed course and not even noticed it, which would be worse. Notice if the question feels more vague, and reflect on why you are changing course, and whether it is a good idea.

You are never given a wish without also being given the power to make it come true. You may have to work for it, however. — Richard Bach, Illusions

Besides thinking deliberately about the question, think also about what it would take to answer it. Maxwell’s data-planning matrix is one of my favorite research planning and design tools; it is described on pages 22-23 of this excerpt of his chapter on methods, taken from the excellent (and short, and practical) handbook on qualitative research design. This planning stage recommends decomposing your current question into workable chunks that you know how to answer:

What do I need to know?
Why do I need to know this?
What kind of data will answer the question?
Where can I find the data?
Whom do I contact for access?
Time lines for acquisition

The above guides are all from Maxwell’s data-planning approach. I like to add the following:

How much investment in infrastructure (building a program, etc) is necessary to get this data? Can I reuse something I built for another question?
What would I need to do to make this something that can be delegated, such as to a student or an inexperienced researcher? I genuinely believe that most things can be successfully delegated in a way that empowers and teaches, but open-ended tasks are much more difficult to delegate.
What related work exists in CHI, CSCW, UIST, UbiComp, and other top conferences and journals that study humans doing stuff using computers? Starting points include ACM Digital Library and IEEE Digital Library and Google Scholar. Although many articles technically require paid access, a lot of the time you can just search for the name of the article and find a PDF for it. Look for two-column conference papers (4 to 10 pages, usually) or dense journal articles (15 or more pages with a lot of detail and charts, usually) that have been peer-reviewed, rather than short poster abstracts that have not.
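One way to keep these prompts in view is to treat each sub-question as a record with one field per prompt, and then list which fields are still blank. The sketch below is only an illustration: the class name, field names, and example values are my own shorthand for the prompts above, not part of Maxwell’s matrix.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanningRow:
    """One row of a planning matrix; field names are illustrative shorthand."""
    need_to_know: str
    why: str
    kind_of_data: str = ""
    where_to_find: str = ""
    contact_for_access: str = ""
    timeline: str = ""
    infrastructure_needed: str = ""  # added prompt: what must be built or reused
    how_to_delegate: str = ""        # added prompt: what makes this delegable
    related_work: str = ""           # added prompt: relevant prior studies


def unanswered(row):
    """List the prompts that are still blank for this sub-question."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


# Hypothetical sub-question with only the first three prompts filled in.
row = PlanningRow(
    need_to_know="Reasons people abandon checkout",
    why="Decide which checkout steps to redesign",
    kind_of_data="Interview transcripts",
)
print(unanswered(row))
```

A row with many blank fields is a sign that the sub-question is not yet a workable chunk.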

How to Conduct Exploratory Interviews

An exploratory interview is the first thing you should do, I think, in so many cases that it may as well be always.

Do have a “protocol” (a series of interview questions), but not one that is too rigid. The semi-structured interview is my favorite method. The “structure” is a list of 3 to 7 questions that can each start a conversation. All participants are asked all the questions in the same order, but with a lot of follow-up questions in between.

Do memorize the first words to a participant. Memorize the “elevator pitch” for why you’re interviewing them and who you are, and remember to ask if they have any questions. Memorize also the first question; for the rest, you can consult a piece of paper, but the first one should roll off your tongue.

Do ask about specific events or actions, “When was the last time you used [app]? What was the context?” The moment someone starts telling you about how she usually uses an application or what she likes or dislikes, she is relaying a narrative she thought about and constructed in her head. This narrative is full of rationalizations and retrospective details that make it a lot less useful than a more concrete account of actual events.

Do ask interesting questions. “What was the most surprising thing you noticed when you were reading tweets about #YesAllWomen?” is a more interesting question than “What did you think about #YesAllWomen?” Similarly, questions asking a person to imagine are more interesting than questions asking them to criticize, but they still get you information about what people are currently lacking.

Do invite your target audience to get passionate. “What’s the best thing about ___? What’s the worst thing about ___?” That said, remember the “be specific” rule: Don’t ask what they like or don’t like. Make them describe a time that was awesome that involved an app you care about, or a time that really sucked. “What,” “when,” “who,” and “where” questions — rather than “how” or “why” — keep things simple and concrete. Always break down a “how” to “…and what happened next?” and take a “why” into “Who else was involved?” and so on.

Do approach participants when they are in groups and you, the interviewer, are on your own. Let people argue among themselves. This does not have to be as organized as an actual focus group, but can still be illuminating.

Do keep a list of participants where you mark off relevant characteristics. Start with demographics, and then add characteristics as you become aware of their importance (such as experience with a particular technology, education level, social situation, level of enthusiasm for something, etc.).

Do make deliberate sampling decisions based on these characteristics — either by focusing on a specific type of user that is more of your target audience, or by maintaining some balance between different types of users. Your definition of type can and should change as you continue to develop an understanding of the people you interview.

Do audio-record the whole thing, and transcribe verbatim. I recommend doing the first two or three of a project by hand and getting the rest transcribed. The act of transcription is equal parts painfully-tedious and a fantastic way to approach the interview anew from a more analytic perspective, much more deeply than just reading it later.
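Going back to the participant list and the sampling decisions described earlier in this section, a minimal sketch of the bookkeeping might look like the following. The participant IDs, characteristic names, and values are hypothetical, and a spreadsheet would do the same job.

```python
from collections import Counter

# Hypothetical participant list; characteristics are added as they
# turn out to matter, so not every participant has every key.
participants = [
    {"id": "P1", "age_group": "18-29", "experience": "novice"},
    {"id": "P2", "age_group": "30-44", "experience": "expert"},
    {"id": "P3", "age_group": "18-29", "experience": "novice"},
    {"id": "P4", "age_group": "45-59"},  # experience not recorded yet
]


def balance(participants, characteristic):
    """Count participants per value of one characteristic."""
    return Counter(p.get(characteristic, "unknown") for p in participants)


print(balance(participants, "experience"))
```

Looking at the counts before recruiting the next participant makes it easier to decide deliberately whether to focus on one type of user or to even out the sample.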

How to Interview to Evaluate a Prototype

Do ask people “Which do you prefer?” and not “Will you use this?” Ask user study participants to compare and contrast rather than to criticize.

Do use paper prototypes first, and make things as rough-looking as possible. Remember, if you make things pretty and in color, people will just tell you that you should improve the colors.

Don’t reinvent the wheel; use standardized usability questionnaires.

Do distinguish between identity and utility. Almost every social application has two purposes that can conflict dramatically: to express an identity and to consume information. Data about how information is consumed on a social application can be a private thing even if revealing some of it is useful. Clarify how people see any particular action; don’t assume.

Don’t audio or video record anything unless there is a really good reason to do it. It’s a pain to set up, it makes people uncomfortable, and every hour of recording is extra work for you to watch, annotate, transcribe, or whatever. Just have someone taking really good notes, and make time after the interview to debrief, type up the notes and your impressions, and that’s it.

Do set up sessions where there are two researchers (one to run the session and one to take notes) and one participant (or however many are minimally needed to test your system).

How to Compose a Survey

Whether to further understand whatever you found in exploratory interviews, or to evaluate more systematically, surveys can be great.

Do use some, but not too many, questions from usability questionnaires — they’re detailed but tedious, and many questions just may not apply to what you care about.

Don’t waste people’s time: keep the survey as short as possible. To a single page, if you can. Keep the questions brief, too.

Do ask agree-disagree questions thoughtfully. The 5- or 7-point “Likert” scale is a common tool for getting hard numbers on subjective questions. It’s important to have some redundancy (those usability questionnaires can be a helpful example), so you can sanity-check responses. Remember, too, that some surveys have the scale from left to right (“strongly agree” to “strongly disagree”), while others reverse it; make those labels prominent just in case, and have a throw-away question or two that would help filter out confused responses. Finally, a 7-point scale is preferable to a 5-point scale, if only because many respondents either shy away from extreme options or prefer them, so if you want to maintain some meaningful gradation of response strength, 7 points works better.

Don’t force opinions where none exist. Although using a scale with an even number of points (e.g. 6) can force people to decide one way or another, the reality is that with many usability or satisfaction scales, some things simply don’t matter to some people. Imagine that a third of your responses are “strongly negative” and two thirds are “don’t care.” This means there are (at least) two different kinds of users, and acting on that response would affect only about a third of the target audience. That may be a lower priority than it would seem if everyone had been forced to pick a side, which can make a non-issue look like a pressing problem. For a similar reason, be sure to include an “N/A” option.

Do have open-ended prompts. Even if you’re only planning on using the scale responses, I highly recommend leaving in several open-ended (or “write-in”) prompts, including one at the end for general questions. Open-ended “Tell me about ___” questions serve three purposes:

They prime the respondents for the upcoming quantitative question, so they have some sense for what you care about;
They help the respondents feel like their opinion is valued and they are not being reduced to a number;
They help you debug your survey. If the open-ended responses make no sense, that could be a good sign to pull the plug and try again, with better language.
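The redundancy check and the “N/A” option discussed in this section can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the item names, the 7-point coding (1 = strongly disagree, 7 = strongly agree), the reverse-worded twin item, and the tolerance of 2 points.

```python
SCALE_MAX = 7  # assumed 7-point scale: 1 = strongly disagree, 7 = strongly agree

# Hypothetical responses; None stands for an "N/A" answer.
responses = {
    "R1": {"easy_to_use": 6, "hard_to_use": 2, "likes_colors": None},
    "R2": {"easy_to_use": 7, "hard_to_use": 7, "likes_colors": 4},
    "R3": {"easy_to_use": 2, "hard_to_use": 5, "likes_colors": None},
}


def inconsistent(answers, item, reversed_item, tolerance=2):
    """True if a reverse-worded item disagrees with its twin by more than tolerance."""
    a, b = answers.get(item), answers.get(reversed_item)
    if a is None or b is None:
        return False  # cannot check a pair with a missing answer
    # Flip the reverse-worded item back onto the same direction before comparing.
    return abs(a - (SCALE_MAX + 1 - b)) > tolerance


# Respondents whose redundant answers contradict each other.
flagged = [r for r, ans in responses.items()
           if inconsistent(ans, "easy_to_use", "hard_to_use")]


def summarize(responses, item):
    """Return (number of real answers, number of N/A answers) for an item."""
    values = [ans.get(item) for ans in responses.values()]
    answered = [v for v in values if v is not None]
    return len(answered), len(values) - len(answered)


print(flagged)
print(summarize(responses, "likes_colors"))
```

Flagged respondents are candidates for a closer look rather than automatic exclusion, and reporting the N/A count next to the scale answers shows how many people had an opinion at all.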

How to Discover Themes

This is a tough one. When you’re looking at an interview transcript, and everything is fascinating at the same time as nothing is, where to start?

Qualitative coding: what is it, actually?

In qualitative coding, I apply a code (a term that summarizes the relevant phenomenon demonstrated) to complete ideas. In interview transcripts, this can be a single sentence; half of a really long, disorganized sentence; or a paragraph, if it’s full of tiny sentences that don’t stand on their own. It depends on the conversational style of the interviewee and the specific code. The rule of thumb here is to remember that you can later query by a code to get all the examples; at the very least, the result of a query for any code should make sense to you.

I recommend Dedoose for collaborative coding. Much like a research question, a code or theme may change in articulation over time. I recommend starting specific and verbose (“concern that browser activity can accidentally trigger unwanted download”) and then making it shorter if it becomes a recurring theme (“DL fear”). Qualitative analysis software, like Dedoose, NVivo, Atlas.ti, and so forth, provides support for managing an evolving collection of codes. This software allows querying by code and attaching memos to codes. As a free alternative, I also recommend saturateapp. When I am the only person doing analysis, I just use a spreadsheet, with participant responses broken into one-or-two-sentence segments; each segment gets a unique ID and its own row, and then I successively add columns for different codes as they come up. I genuinely like this method, but it absolutely does not work if there is more than one person doing analysis, so familiarity with software is recommended.
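For readers who prefer a script to a spreadsheet, the same layout (one row per segment, one column per code) can be sketched in plain Python. This is a hedged illustration, not a replacement for the tools above: the segment IDs, the segment text, and the helper names `query` and `codes_for` are all made up for the example.

```python
# Each row is one transcript segment; each code is a column holding True
# when the code applies. Segment text and code names are invented here.
segments = [
    {"id": "P1-001", "text": "I didn't open the email in my browser.",
     "DL fear": True, "mental model mismatch": True},
    {"id": "P1-002", "text": "It wasn't clear if the upload worked.",
     "lacking feedback": True},
    {"id": "P2-001", "text": "I worried a click would download something.",
     "DL fear": True},
]


def query(segments, code):
    """Return every segment tagged with the given code."""
    return [s for s in segments if s.get(code)]


def codes_for(segments, participant):
    """Set of codes present at least once for a participant (existence, not counts)."""
    meta = {"id", "text"}  # columns that are not codes
    return {k for s in segments if s["id"].startswith(participant + "-")
            for k, v in s.items() if k not in meta and v}


for s in query(segments, "DL fear"):
    print(s["id"], s["text"])
```

Querying by code is the sanity check mentioned earlier: if the segments returned for a code do not read as examples of the same thing, the code probably needs to be split or reworded.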

Give up on parallel concepts and completeness

When I first started to categorize qualitative data, my impulse was to internally construct the outline: “if you apply a code for ‘anger,’ then other codes must be feelings as well, and furthermore I should be looking for happiness.” There are two core misconceptions. First, that codes must be of the same type (or, “parallel,” if you will). If all your codes are parallel and mutually exclusive, that forms a single axis or axial coding, which is a later stage of analysis than open coding.

Second, you generally should not count occurrences of codes. This is strictly about existence. Non-existence means nothing. It means that maybe you didn’t ask the questions. Counting code occurrence in a single interview also doesn’t mean much besides the possibility that the participant talked a lot about this one thing, either because they really cared about it, or (more likely) because they didn’t care much about the other things you asked about. You cannot talk about what is not there. You can only talk about what is there. If only “angry” is expressed but not “happy,” then your emotion codes include angry but not happy. It’s fine.

Some themes of themes to get you started

“In vivo” codes: this refers to codes that are named after participants’ own words, and can be among the most useful and interesting. For example, in a study of how people use online health resources, a few participants used the phrase “keeping an ear to the ground.” This in vivo code was then further applied to other explanations of information monitoring, and eventually became a major theme.
Emotional expression and mental state: comments or accounts of feelings or thoughts (I was angry; this was confusing)
Articulation of social norms or expectations (“..but obviously nobody would like an old photo that would be creepy.”)
Category of social technology use (private communication; organization; information dissemination; entertainment; education; persuasion)
Breakdown type (lacking feedback for whether succeeded or failed; lacking guidance on what to do next)
Mental/conceptual model of software (“so obviously I didn’t want to open the email in my browser because it showed there was an attachment there” — you might, for example, code this both as “mismatch between user mental model and system” and “concern that browser activity can accidentally trigger unwanted download”)

How to Use Theory and Related Research

I cannot recommend enough looking for related work (in HCI-related publications as well as just psychology, education, communications, whatever is relevant) during analysis. One trap of user research is spending a great deal of time and determining only that “people use the thing” and “every phenomenon we noticed is related somehow to every other phenomenon.” At least explain how things are related, and try to say something somewhat stronger than “some behaviours happen.” Existing related work can provide an excellent basis for making such stronger statements; if nothing else, reading related papers helps you brainstorm what relationships you want to look for in your data.

Theory what now?

There are three kinds of theories:

Explanatory theories seek to explain the behavior of our world; they tend to provide a more conceptual model of the world. Predictive theories seek to predict outcomes based on the changing values of component variables. Generative theories generate guidelines and principles that provide useful and applicable knowledge and models. — Giacoppo

Or maybe there are fourteen. I dunno. In a lot of ways, the jury is still out. For the purpose of this conversation, I contend that these are the types of theories (or, more broadly, related work) you should care about. I call it the People-Use-Things Theoretical Categorization of Theories:

“Person uses thing in X context / for Y reasons”: theories explaining something about an individual using technology, with focus on the individual. Most useful for answering questions that you want answered but don’t want to re-do the work of answering.
“People do things together, using thing”: theories focusing on collective or social activities as mediated by technology. Also useful, but social technologies are at the mercy of even more factors.
“There are people-thing combinations”: theories that categorize social or individual behaviors as coupled with particular technological contexts or affordances. These theories are very informative, though sometimes difficult to distinguish from their far more vague cousins:
“There are people and/or there are things”: theories that categorize users and/or technological affordances, but not the intersection of the two. Rarely applicable beyond the original context of study, and possibly the unfortunate result of running a study, discovering that “people use the thing,” and stopping short of going further.
“People are; things also are”: something calling itself a theory or a finding does not necessarily mean it is either. Often you get basically just accounts of some people and/or some things without organization or theorizing. This can be useful for brainstorming and early on in the analysis. It is infuriating and/or boring at the end of a project.
“You think there are people and there are things? Think again.”: like the above, this is more useful in the beginning than in the end. Unlike the above, this type of prior work provides a nuanced conceptual framework. There are lots of theories that are really cognitive tools, providing ways of thinking about the world of things and people. For example, actor-network theory (ANT), activity theory (AT), and distributed cognition (DCog) are some of the most popular such theories in the study of humans and technologies.

Fundamental Principles

Or, Ideas That I Think Ought To Be Fundamental

Principle 1: The creators of technology are in a difficult spot: on the one hand, we are users of technology; on the other, we are often not typical target users. As such, we can be limited by unspoken assumptions. Open-ended study of how people use technology is important in uprooting our assumptions about what technology does, is, or could be.

Principle 2: Technology and its use is pervasive. Besides hardware questions (e.g., device ecosystems or networking infrastructure) and software questions (e.g., compatibility and social buy-in), there are questions of social norms (e.g., texting on a date) and policy (e.g., Google Glass in a movie theater) and access (e.g., blind users and voice-over, or users with cerebral palsy and alternative pointing devices). The researcher must get creative and use a variety of methodological tools to understand anything at all about how people use technology.

Principle 3: The subject of user research is the technology or its use. People are not your subjects; they are your informants or participants. Remember that they (most often) get involved (1) because you are nice, and/or (2) because what you’re working on is cool. The researcher must actively pursue constructive criticism through deliberate method selection.

Kit Kuksenok is a Berlin-based researcher and data scientist. They hold an MSci (2014) and a PhD (2016) in Computer Science and Engineering from the University of Washington (Seattle). Feel free to reach out with feedback or questions on here or through my contact form.