I’ve been in the world of User Experience (UX) since 2012.
Most of my roles have been research-oriented, which means I’ve put myself in the shoes of website users, app users, device users…hundreds, if not thousands, of human beings. Hours upon hours of laughter, frustration, and even some tears.
User researchers like me open up their hearts and minds to others all day, all year, all their careers.
User research professionals usually score on the high end of emotional intelligence tests, and for so long, across so many industries, emotional intelligence was touted as *such* an important factor for success in your job.
And now…Artificial Intelligence (AI) is the shiny new thing that’s supposed to differentiate you in your professional career.
It feels like getting ahead is all about mastering your prompt-writing skills.
Fans of AI tools proclaim, “It’s about doing in minutes what used to take you hours!”
In the last few years, I’ve heard industry podcasts and webinars and conference speakers tell us that the professionals who embrace AI and weave it into their work are gonna outstrip all the old-timers and usher in a new, highly adaptable, wicked-fast sort of workforce.
I’ve even got vendors telling me they’ve amassed data from enough real-world users to predict their reactions to future designs. So like…they’re simulating test participants instead of actually recruiting human beings for research.
Okay.
I guess I need to practice prompt-writing every day, learn how to ask AI the right questions, and try to recognize more moments when I could turn to it.
I guess a tool informed by thousands of data points probably would be able to spot some trends and safely draw conclusions based on the memory of human beings rather than the reality of them.
But sometimes using AI in the context of user research feels a little like ripping my own heart out.
“But Ki,” I hear you wonder, “you’re clearly using AI to make images for this post. So you’re full of malarkey and poppycock, aren’t you!?”
To which I reply:
- First and foremost, you really need to update your slang. Poppycock? Really?
- Secondly, writing on Medium isn’t compulsory for practicing user research; it’s just a hobby, and…
- Finally, I’m desperately hoping to avoid being this guy:
That means I’ve got to open up to this AI stuff.
I’ve got to reflect on where it’s already folding into my work and experiment more actively with letting AI supplement it.
I still consider myself extremely new to this technology, and very reluctant to let AI do the heavy lifting. But I thought I’d document some AI + user research collabs I have attempted so far, and my thoughts on the results:
One: Writing a FIRST DRAFT of a survey/interview script
I want to distinguish between a full test plan and a script. Test plans include study objectives, methods, stakeholder roles…basically everything related to the test. The script is the screener questions, instructions, and questions that participants go through to complete the study.
Before I ever popped into an LLM, I talked over the project with stakeholders and had a solid set of objectives and a desired method typed up. Then I plugged it into the ol’ AI machine and hoped for the best.
It actually did a pretty decent job of developing 15 or so questions, but to me they felt stiff and formal.
I tried telling it that, and all it did was apologize. 😆 A lesson in refining prompt writing, I guess.
The questions written by AI were also pretty blunt, basically regurgitating the words from my goals/methods and repackaging them as direct questions. I’m paraphrasing, but think about something like:
- Me: “The goal is to understand what challenges customers faced while shopping online”
- AI: “Question 4 (open ended): What challenges did you face while shopping online?”
It also didn’t include follow-ups or any branching logic. Again, something I could probably prompt it to do, but it started to feel like it was wasting my time real fast.
In fact, I spent more time writing prompts to reshape the questions (and then rereading the updated versions to spot the changes) than it would’ve taken me to just write the questionnaire the way I wanted it myself.
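(If you’re curious what I mean by reshaping prompts, here’s roughly the scaffolding I wish I’d front-loaded: tone, follow-up probes, and branching logic spelled out before the model drafts a single question. The Python below is just a sketch with my own stand-in objectives and wording, not the actual tool or prompt I used.)

```python
# Rough sketch of a front-loaded script-drafting prompt (plain Python, no
# external libraries). The objectives and wording are my own stand-ins,
# not the actual tool or prompt I used.

STUDY_OBJECTIVES = [
    "Understand what challenges customers faced while shopping online",
    "Learn what information customers want before contacting a dealer",  # hypothetical example objective
]

def build_script_prompt(objectives, method="moderated 1:1 interviews", n_questions=12):
    goals = "\n".join(f"- {o}" for o in objectives)
    return (
        "You are helping a UX researcher draft an interview script.\n"
        f"Method: {method}\n"
        f"Study objectives:\n{goals}\n\n"
        f"Draft roughly {n_questions} questions. For each question:\n"
        "- Use a conversational, neutral tone (no leading questions).\n"
        "- Do not repackage the objective wording as a direct question.\n"
        "- Include one or two follow-up probes.\n"
        "- Note any branching logic, e.g. 'if the participant has never shopped online, skip ahead'.\n"
    )

if __name__ == "__main__":
    # Paste the output into whatever LLM you have access to, or wire up an API call here.
    print(build_script_prompt(STUDY_OBJECTIVES))
```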
So for writing test scripts…4/10 stars. I wouldn’t recommend unless you HATE writing test scripts. And are you even a researcher if that’s the case?
Two: Analyzing individual session transcriptions
Our research repository includes AI analysis on a session-by-session level. I open up an uploaded video, hit the fancy button, and their AI highlights moments in the video that it thinks are relevant.
Great idea in theory, but when I started combing through the suggested moments, I kept shaking my head. Like, the user was complaining about the web conferencing tool, not the tool we were testing. Or it caught the user paraphrasing someone else, talking about how much they loved something.
There was a small section at the top of the video where I could ask a single question, and the platform would try to highlight relevant clips. But I don’t want to type in my research questions one by one and then read and accept/reject each suggestion.
At some point, just reading the whole transcript ONCE and flagging what’s relevant to my research goals takes less time.
So for analyzing individual sessions/data from a single user…4/10. It might score higher if I could plug in all my research questions at once and only had to do one pass at the suggested highlights.
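(For what it’s worth, here’s a rough sketch of the one-pass version I’m wishing for: every research question goes in together, and the model maps transcript excerpts to them in a single shot. The helper, questions, and prompt wording below are placeholders of mine, not the repository tool’s actual feature.)

```python
# Sketch of the "all questions at once" pass I wish I had. The research
# questions and prompt wording are placeholders of mine, not the repository
# tool's actual feature.

RESEARCH_QUESTIONS = [
    "RQ1: Where do users struggle while building & pricing a vehicle?",
    "RQ2: What information do users expect before contacting a dealer?",
]

def build_highlight_prompt(transcript, questions=RESEARCH_QUESTIONS):
    rq_block = "\n".join(questions)
    return (
        "Below is a usability session transcript plus my research questions.\n"
        "For each research question, quote up to three relevant excerpts verbatim, "
        "with timestamps if present.\n"
        "Ignore complaints about the web conferencing tool itself, and flag any "
        "excerpt where the participant is quoting or paraphrasing someone else "
        "rather than reacting to the design.\n\n"
        f"Research questions:\n{rq_block}\n\n"
        f"Transcript:\n{transcript}\n"
    )

if __name__ == "__main__":
    print(build_highlight_prompt("00:12 P1: I can't find the continue button..."))
```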
Three: Analyzing a batch of user interview notes
Not too long ago, I had the opportunity to do classic, behind-the-glass observation testing at a market research office in Miami.
I spent 3 days with 4 other people taking notes (and drinking Cuban coffee). The stakeholders wanted results FAST, and I was about to go on vacation.
So on my flight home, I popped open our internal LLM tool and uploaded my notes to see if it could summarize things for me.
To my surprise and delight, this worked incredibly well — I only felt the need to tweak one or two of the bullets and had to do very minimal re-prompting.
I popped a couple of screenshots in alongside the AI-generated bullets and fired off the email summary before I left for PTO.
Here’s why I think this worked better:
- As I took notes, I was already translating things into our internal language. Users kept overlooking a ‘continue’ CTA on a particular page, and every time it happened, I was referencing the issue the same way within my notes (‘banner blindness on the VFT page’). If it had been analyzing transcripts, I doubt the AI would’ve known the name of the page where folks got stuck or understood why they were getting stuck.
- My notes were in outline form. When users moved from one topic or site section to another, I was heading a new section with the topic or page name and tracking both what they said and what they did. I don’t find AI very capable of analyzing that latter piece on its own.
So using AI to analyze a batch of session notes in a crunch: 8/10! If you’ve got (or you are) a consistent note taker, this might save you some time summarizing trends in your interviews or unmoderated sessions.
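(If you want to try the same trick, here’s a barebones sketch of how I’d structure it. The notes mimic the outline format I described above, and the prompt is my own wording; you’d paste the output into whatever LLM tool you’re actually allowed to use.)

```python
# Sketch of summarizing a batch of consistently labeled session notes in one
# pass. NOTES mimics the outline format described above; the prompt wording
# is mine, not my internal tool's.

NOTES = """\
P1 - VFT page
- said: "I have no idea where to go next"
- did: scrolled right past the continue CTA (banner blindness on the VFT page)

P2 - VFT page
- did: clicked the hero image expecting it to advance (banner blindness on the VFT page)
- said: "this feels like a dead end"
"""

def build_summary_prompt(notes, n_themes=5):
    return (
        "These are observation notes from moderated usability sessions. "
        "Sections are headed by participant and page name; 'said' lines are "
        "quotes, 'did' lines are observed behavior, and recurring issues use "
        "consistent labels in parentheses.\n\n"
        f"Summarize the top {n_themes} themes as bullets. For each theme, say "
        "how many participants hit it and include one representative quote.\n\n"
        f"Notes:\n{notes}"
    )

if __name__ == "__main__":
    print(build_summary_prompt(NOTES))
```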
Four: Summarizing everything we know about a particular page, term, or feature we’ve tested multiple times
Again, this example comes from within our research repository tool, which houses over two years of collective knowledge about our users gathered by UX Research, Analytics, Experimentation, and Digital Strategy teams.
There’s always someone changing teams, joining the team, or asking us for a recap of EVERYTHING we know about some part of our site.
And while our internal labels are one way of pulling that up, our repository tool also has an AI summarization feature that lets us ask something like, “what problems do users have when attempting to build & price a vehicle online?” Within a few seconds, there’s an AI-generated summary, with linked annotations to data and reports, to help get the lay of the land.
Pre-AI, this activity usually required a full Literature Review project, which could take weeks and tended to be a bit tedious — if it was deemed enough of a priority to do at all.
Now folks can get up to speed in a matter of minutes…
…and that’s what gives me pause.
Because like I said at the beginning of this post, there are rich emotional moments in our research. High highs and low lows.
A single story or a 5-minute stretch of conversation can bring a user’s pain to life and light a fire in the heart of the right stakeholder. It can give them purity of focus, a sense of drive, and the ability to prioritize what’s right for both the business and the customer.
What are the chances that an AI-generated summary is going to provide that kind of inspiration?
I’ll give my stakeholders the benefit of the doubt and hope that they don’t just read the AI-generated summary. Maybe they click through the source materials and clock a little exposure time.
But when so much of the glowing praise about AI has to do with saving time, I can’t help but worry we are just using it as a way to avoid having a heartfelt connection with our users.
So when it comes to using AI as a way to summarize our collective knowledge about a topic, I’m gonna give it a 6.5/10.
Five: AI-generated design evaluation
One of our vendors has done thousands of web design tests across numerous industries. They have standardized benchmarking questions that are asked for every design. Their participants have marked up key pages of those designs with likes and dislikes, along with open-ended comments explaining their choices.
The vendor used this extensive data to inform a new offering — you upload a design or submit a link, and it will predict what users might like or dislike about the design.
It also generates some potential personas you might want to consider (since we work in the automotive space, our personas are folks like an electric vehicle shopper, a parent responsible for transporting kids to sports events, the sports car enthusiast, etc.).
This tool is still pretty new, and I wasn’t expecting much. But I have to admit…I was once again pleased with the results.
For example, I uploaded a page that used the term “badass” in one of the headers, which doesn’t really match our style guidelines.
Sure enough, it highlighted the header that used “badass” and suggested using caution with language that might seem informal and unprofessional.
The personas had cheesy names like “Soccer mom Suzy” (not real, just an example), but they were pretty well crafted, too. Similar to the target demographics and screening criteria we use ourselves.
I mostly cringe at the notion of AI-simulated users, but for whatever reason, this application of AI (informed by real, recent human feedback) feels less *wrong* to me. Still more like a ghost or an echo of a person as opposed to the real thing…but maybe it has its uses.
Part of my cautious optimism is that I know EXACTLY what methodology they’ve used and what kind of data they’ve captured. There are standardized questions and activities that thousands of people have done. I’ve seen comments like theirs in my own projects, and they feel true and human. The AI is fed by a clear and closed data source (as far as I know, anyway).
I also appreciate that their recommendations and warnings are well phrased — they aren’t saying “users will hate this thing”; they’re saying “some users who value X may have trouble with this thing”. They offer directional guidance and thought starters, as opposed to shoving the designer away from a choice they’ve made.
While I don’t love the idea of our designers relying on AI predictions, here’s the reality: our team’s ratio of designers to researchers is ~8:1.
The UXR team doesn’t have the capacity to pester every designer to bake regular testing into their work. Designers vary wildly in their ability to build strong testing objectives or conduct projects by themselves. In those instances, I can see using an evaluative tool like this as a “gut check” on the design, a way to forecast potential risks, and maybe a way to develop clearer research goals or hypotheses.
If the ghost of human insight is the best I can offer a designer in a pinch, I guess I’m kinda on board. 8.5/10.
Like I said at the beginning, one trademark of a good user researcher is openness. They’re open, and they encourage openness in others.
There’s a lot about being a user researcher that couldn’t and shouldn’t be done by AI. (That deserves its own post; stay tuned.)
There are times where I still cross my arms and shake my head when it comes to layering AI into user research activities.
But now that I’ve spent some time documenting my own experiences, I’m feeling less like I need to go scream at clouds.
I’ve set a goal this year to unclench a bit and dedicate some time each week to learning. I’ve already found some videos and articles out there with tips, tricks, and demos to get my feet wet and up my skillset.
Creating the images for this post was my first time using an AI image generator. Even though some of the results were way off and I had to re-write prompts a few times, I have to admit: I had a ton of fun!
I can’t help but lose heart every now and then when I see AI encroaching on the field of user research. But I can try to embrace bits and pieces where it makes sense and feels right. And now I realize that I might even have fun along the way.