Machine Learning & Qualitative User Research
In the 20th century, the French anthropologist Claude Lévi-Strauss proposed a theory of “binary oppositions”. While a complex and often misunderstood topic, a binary opposition can broadly be defined as a pair of related terms, ideas or concepts that are played off against one another: left vs right, good vs evil, and so on.
In recent years, qualitative and quantitative research have been treated as binary opposites, with “design” and “science” on opposing sides. For example, consider the following from The Atlantic:
“design research isn’t a scientific endeavour aimed at finding truths. Our clients typically can’t afford the large sample sets and extended time frames necessary for such a “scientific” process. And sometimes design teams don’t have the patience to see the value in dragging out a study in an effort to make it scientifically or statistically significant.”
However, I believe this is an unnecessary binary opposition that pits one side against the other. In reality, customer research is a nuanced, blended approach that applies the right tools, whether quantitative or qualitative, at the right time in the process. The advent of computing, and in particular of Bayesian statistics and machine learning, means there is no excuse for avoiding quantitative datasets as part of design research.
Within the insights team at 383, a typical part of our research process is talking to people. Whether via Jobs to be Done (JTBD) or a scripted interview, qualitative researchers typically aim to understand user and business needs through the age-old method of discussion. While you should always bring rigour and best practice to the process, there is sometimes concern that “enough” users haven’t been spoken to, or uncertainty about how to scale these insights.
If we were to simply focus on the binary opposition of quantitative vs qualitative, then we would resign ourselves to the idea that interviews sit firmly on the side of qualitative “design research”, as per The Atlantic example, and operate our “scientific endeavours” elsewhere.
“The tensions between qualitative and quantitative work often seem more tribal than reflecting methodological differences.”
However, given the emphasis on a blended approach, we have found more valuable insights from combining qualitative and quantitative research.
When transcribing and analysing customer interviews, we often begin to find overarching themes and topics. During a recent round of interviews for a service, we found that users typically talked about two to three recurring themes explaining why they used the service. Taking these insights, we then used machine learning to understand how these themes impacted the broader customer base.
In detail: we used the Twitter API to capture public tweets directed at the service. We then used a machine learning method known as Naïve Bayes classification to understand the sentiment of those tweets and the topics discussed. From this, it became possible to validate that the customer base was talking about the themes identified in the user interviews, and whether they felt positively or negatively about the service in relation to them. Furthermore, given that Twitter was operating as a “last chance saloon” for customer queries, we identified that a major issue was customer service, including the timeliness of responses and issue resolution through channels such as phone and email.
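To make the classification step concrete, here is a minimal sketch of tweet sentiment classification with a Naïve Bayes model. The original project’s tooling isn’t named, so scikit-learn is an assumption, and the handful of labelled tweets below are invented examples rather than real project data:

```python
# A minimal sketch of Naive Bayes sentiment classification for tweets.
# The tiny labelled training set is illustrative only; a real project
# would use hundreds of hand-labelled tweets captured via the Twitter API.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_tweets = [
    "love the new app, so easy to use",
    "great service, quick delivery",
    "still waiting on hold, terrible support",
    "no reply to my email for a week, awful",
]
train_labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words features feeding a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_tweets, train_labels)

# Classify a previously unseen tweet
new_tweets = ["been on hold for an hour, awful support"]
predictions = model.predict(new_tweets)
print(predictions)
```

The same pipeline shape works for topic classification: swap the sentiment labels for theme labels drawn from the interview analysis and retrain.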
This quantitative data, combined with our user interviews, allowed us to not only get to grips with the jobs (a term in JTBD) around the service, but also understand more broadly how these interview themes were prevalent for the customer base. We were also able to understand a major pain point, namely customer service, that was impacting users and didn’t surface as clearly through customer interviews.
This method of combining machine learning with user interviews is not without its drawbacks. Typically, you’ll need to work with a client large enough to have its own pool of customer feedback data. There is also work involved in creating a training set and a classification model appropriate for your project. Finally, while you will always want more data, 65–70% accuracy is typical for a project with less than a week’s worth of data collection, so be prepared to spend time verifying and updating the data manually.
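The manual verification step above can itself be partly systematised: hold back a small set of hand-labelled tweets and score the model against it. A hedged sketch, again assuming scikit-learn and using invented example tweets:

```python
# Sketch of checking classifier accuracy against a small, manually
# labelled hold-out set, as you would when verifying model output by hand.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

train_tweets = [
    "love this service",
    "brilliant app, works perfectly",
    "no answer on the phone again",
    "worst support ever, still waiting",
]
train_labels = ["positive", "positive", "negative", "negative"]

# Hold-out tweets labelled by a human reviewer, never seen in training
holdout_tweets = ["works perfectly for me", "support never answer the phone"]
holdout_labels = ["positive", "negative"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_tweets, train_labels)

# Compare model predictions against the manual labels
accuracy = accuracy_score(holdout_labels, model.predict(holdout_tweets))
print(f"hold-out accuracy: {accuracy:.0%}")
```

Tweets the model gets wrong on the hold-out set are exactly the ones worth relabelling and feeding back into the training data.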
With binary oppositions, we pit our research methods against each other. However, by moving beyond this, we can focus on our end goal of using a range of research methods to improve the customer experience.