Bending the Research Rules: Measuring Sentiment
About 3 months into my role at SEEK, the Candidate product stream (that is, the side of our site that you’re probably most familiar with — looking for jobs) wanted to better understand job seeker sentiment toward SEEK.
As a researcher, this piqued my interest, so I poked my nose in.
The goal in measuring sentiment is to provide a leading indicator of how our users are feeling about SEEK, regardless of their experience in finding a job. This would be used to adjust the things we were working on, and ensure we were solving the right problems. It was intended to be a rolling metric.
Previously, the SUPR-Q and the NPS had been used in an ad-hoc fashion in individual product spaces, but not to measure sentiment specifically.
There had been some discussion around using SUPR-Q to measure sentiment — because SUPR-Q covers loyalty, appearance, trust, value and comfort, and usability.
However, any researcher worth their salt knows it’s important to get stakeholder buy-in from the get-go. (For a more in-depth discussion on that old chestnut, read this article). So, we surveyed the product stream first to understand what people thought sentiment meant, and how they expected it to be measured. This also meant we got ourselves aligned on the idea.
Remember, the title of this article is bending the rules. At this juncture, I was already bending the rules by getting stakeholders involved in defining what we were measuring, how, and why. A research purist will tell you that’s a bad idea, because stakeholders might not understand what constitutes “good” research.
A pragmatist will tell you that getting your stakeholders involved sooner will mean they’re more likely to listen to and use your research results (in some way).
The survey asked our product stream what sentiment meant to them, their product area, and what they’d like to learn. We also asked how they thought this information (i.e. sentiment) would be used, and how sentiment could be measured.
It turned out that what the product stream wanted to learn from measuring sentiment differed from how it had been measured previously. Metrics like the SUPR-Q and NPS only tell the “what” part of the story. What the stream was after appeared to reflect a need for some qualitative data.
The product stream told us that candidate sentiment means:
- feelings toward SEEK or about SEEK
- candidate experience and emotion
- why a candidate uses SEEK
- whether they would come back
- how they fare when using SEEK
It was not how a candidate feels about not having a job (more on that later).
There’s a lot to unpack here — the above responses incorporated a little bit of usability, a bit of loyalty, a bit of trust, value and comfort, and a bit of qualitative stuff — the why behind the what. So, a single metric probably wasn’t going to tell us what we needed to know.
Our survey respondents also told us that they planned to use the learnings in the following ways:
- as a leading indicator of what needs to be fixed
- to help us focus on the right priorities
- to shift us towards looking at the end-to-end experience
- to understand the most important problems and prioritise work
- to understand the effectiveness of work we do.
After the survey analysis was complete, I put together a simple approach for measuring sentiment based on the results. Then I ran a workshop with the Product Managers, UX designers, and Delivery Managers to get feedback on it and work out how we were going to tackle this.
First, I jumped on Google to see what other researchers had to tell me about measuring sentiment.
“Traditional” sentiment analysis seeks to understand what words people are using to describe your product, and what that means. Usually, you’d be looking at things like customer feedback data or social media, and some kind of AI wizardry would aggregate that for you. There were two problems with this: at the time, we didn’t have anything that could do it neatly, and customer feedback data and social media can be heavily skewed positive or negative.
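To make that “AI wizardry” a little less magical: at its crudest, lexicon-based sentiment analysis just counts positive and negative words and averages the result. This is a minimal sketch of the idea, not what any real tool does wholesale — the word lists and feedback comments below are invented for illustration:

```python
from typing import List

# A tiny hand-rolled polarity lexicon. Real tools use much larger,
# weighted lexicons or trained models.
POSITIVE = {"easy", "fast", "helpful", "clear", "love"}
NEGATIVE = {"slow", "confusing", "broken", "hate", "spam"}

def polarity(feedback: str) -> int:
    """Score one comment: +1 per positive word, -1 per negative word."""
    words = feedback.lower().split()
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

def aggregate(comments: List[str]) -> float:
    """Mean polarity across a batch of comments."""
    return sum(polarity(c) for c in comments) / len(comments)

comments = [
    "love how easy and fast the search is",   # scores +3
    "the filters are confusing and slow",     # scores -2
]
print(aggregate(comments))  # → 0.5
```

Even this toy version shows the skew problem: a handful of strongly worded comments can drag the average around, which is exactly why we were wary of relying on raw feedback channels.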
I put that thought aside for later, and tweaked my search to see if there was anything “UX-y” I could do — such as any UX research metrics, or other ways to solicit users’ thoughts that addressed the research objectives as shaped by the team.
I happened across this article, which summarised nicely how one could capture user sentiment as part of a usability test, using a combination of emojis and Microsoft Product Reaction Cards (MPRCs).
I also looked at Nielsen-Norman Group’s article on the cards, and reflected upon my own experience of using emojis and language as a way to gather sentiment in a diary study. It had worked pretty well then — we asked users to choose an adjective and an emoji to describe their experience.
So, based on what the team wanted, and what others had done before, this is what I came up with:
1. Face-to-face usability testing, incorporating the SEQ, SUS, and SUPR-Q, plus some to-be-defined sentiment exercises using emojis and the MPRCs. I figured that doing face-to-face research would enable us to understand the “why” behind the what. And if the sentiment portion was a bust, we’d at least get a usability baseline out of it.
2. An on-site survey to collect the SUS and SUPR-Q at key interaction points on the site. The idea here was to keep the collection of metrics simple, building on previously collected SUPR-Q data. Collecting at key points across the site meant we’d get a nice cut-through of what people were doing on-site.
3. Aggregated customer feedback data. Given we already had this data, this idea was popular, but we didn’t have a neat way of pulling it all together and analysing it.
During the workshop, I ran through the survey results and the above approach, asking people to vote on and talk through what approach they felt would work best.
It was decided that we’d do the face-to-face research first. Then we’d use that as a basis to survey a wider audience, collecting the SUS and SUPR-Q and asking the same sentiment questions.
At this stage, I wasn’t feeling 100% sure that this would work. This approach *did* address the research objectives, but I’d sort of smooshed together some methods as a result of my own experience, along with some magical Internet resources. It wasn’t “tried and true”.
So, I ran this past my boss at the time. He also jumped on Google and came to much the same conclusion I did.
In an earlier conversation, he’d told me “You know all the user research methods out there — perhaps it’s time you started to break some of the rules”. I figured that was as much of a blessing as I was ever going to get.
I began to flesh out the interview guide for the face-to-face research. For the sentiment portion, we’d ask participants to complete two sentences about how they felt about SEEK (e.g. “SEEK makes me feel…, I would prefer to use SEEK if…”). Then, they’d choose 5 words from the MPRCs, and choose an emoji that best described their feelings toward SEEK.
After that, they’d attempt the usability testing portion, looking at 5 key tasks across the site. The metrics collected here were the SEQ, SUS, and SUPR-Q. I recruited a mix of people who used SEEK as their primary job seeking site, and people who hadn’t used SEEK, or used a competitor as their primary job seeking site.
We’d repeat the sentiment exercises after participants had completed the usability portion of the session. My hypothesis was that usability would have an impact on sentiment.
After the face-to-face rounds were completed, I put the SUPR-Q and SUS surveys up on the site as intercept polls. I also put up the MPRCs and the complete-the-sentence exercises as a way to collect some qualitative data. 2,000 responses later, the results were much the same as what people told us in the face-to-face testing (and I never wanted to look at Excel again!)
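For what it’s worth, the SUS number-crunching that ate all that Excel time is easy to script. The standard SUS scoring rule is: odd-numbered items (positively worded) contribute score − 1, even-numbered items (negatively worded) contribute 5 − score, and the total is scaled by 2.5 to give a 0–100 score. A sketch, with a made-up respondent:

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from 10 item responses, each rated 1-5.

    Odd-numbered items (1st, 3rd, ...) are positively worded: score - 1.
    Even-numbered items are negatively worded: 5 - score.
    The summed contributions (0-40) are scaled by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    total = sum(
        (score - 1) if i % 2 == 0 else (5 - score)
        for i, score in enumerate(responses)
    )
    return total * 2.5

# One invented respondent, alternating agree/disagree across the 10 items.
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # → 85.0
```

Mapping this over 2,000 survey rows is a one-liner from here — a lot friendlier than dragging formulas around a spreadsheet.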
So…did it work?
We found that because SEEK is…well, pretty usable, usability didn’t have a huge impact on job seeker sentiment toward SEEK.
With regard to the MPRCs, the words participants chose tended to be fairly practical and functional. A lot of the other data we collected reflected things we already knew about what job seekers found valuable on the site, and what they wanted in addition to that. These things played into sentiment.
We also found that, despite the team wanting a view of sentiment toward SEEK rather than how job seekers feel about the job-seeking experience, the two were inextricably linked. No matter what, a person’s sentiment towards a means of getting a job is going to be related to their experience of getting a job!
I’d like to run this research again, to see if the things people raised as issues have changed, or are still true. Our product stream seemed to get the most value out of seeing verbatims from users: their responses to what exists on the site at the moment, and what’s not there yet.
However, I don’t think I’d run it as a usability test sandwiched by sentiment again — it’d likely be a contextual research piece to make the experience more “real” for the user. I would still use the sentiment exercises in some way, and follow it up with on-site surveys again. I think it’d be useful to do this sort of thing every 6 months — tracking some of the negative responses, or “warning signs” and see if they get any bigger.
Remember how this metric was intended to be rolling? We can’t really run the surveys on-site constantly — that would slow down the site experience and be way too much data to sift through. We’re still working on a way to aggregate all of our customer data into one spot. When and if that happens, expect to see another article!
With thanks to Mark O’Shea, Leah Connolly, Kayla Heffernan and the Candidate product stream at SEEK. Y’all are awesome.