Should ML Topic Modeling replace your rules-based analysis?

Corina Paraschiv
Mixed Methods Research
4 min read · Mar 9, 2024


We deal with large sets of data at work, and perhaps you do too. A natural way to speed up that work is to look for ways to leverage our computers’ power.

In a recent project, I was working with a large dataset composed of user-generated text. My goal was to extract the main topics discussed by participants in a survey. Because the volume of data was too large to process manually, I considered different options.

As I started the project, I faced the following methodological decisions.

Key methodological decisions for machine-learning topic modeling: (1) degree of automation, (2) reducing bias, and (3) capturing subjective meaning

Working through these key decisions allowed me to select the right approach for our project:

  1. A first round of analysis using rule-based coding, which produced the primary findings
  2. A second round of analysis using machine learning to augment those findings

By combining these two methodologies, we were able to meet the following criteria:

  • The computer-aided interpretation accurately depicted participants’ voices, with reduced researcher bias
  • The subjectivity and variance coming from various participants were aptly captured
  • The codes could be re-used and scaled for future analysis, as this is a recurrent project
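
To make the first, rule-based round more concrete, here is a minimal Python sketch. It is not the code from the project: the codebook, its keywords, the sample responses, and the function names are all hypothetical, chosen only for illustration. The sketch assigns codes by keyword match and sets aside any response that no rule covers. Treating those leftovers as one input to the machine-learning round is my assumption about how the two rounds could connect.

```python
import re
from collections import Counter

# Hypothetical codebook: each code maps to keywords that signal it.
# A real codebook would come from the research team, not from this sketch.
CODEBOOK = {
    "pricing": {"price", "cost", "expensive", "cheap"},
    "support": {"support", "help", "agent", "response"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def apply_codebook(response, codebook=CODEBOOK):
    """Return the sorted list of codes whose keywords appear in the response."""
    tokens = set(tokenize(response))
    return sorted(code for code, keywords in codebook.items() if tokens & keywords)


def first_round(responses, codebook=CODEBOOK):
    """Rule-based round: code what the rules cover, set the rest aside."""
    coded, uncoded = {}, []
    for response in responses:
        codes = apply_codebook(response, codebook)
        if codes:
            coded[response] = codes
        else:
            uncoded.append(response)
    return coded, uncoded


# Made-up survey responses, for illustration only.
responses = [
    "The price is too expensive for what you get.",
    "Support took days to help me.",
    "I wish the app worked offline.",
]

coded, uncoded = first_round(responses)

# How often each code was assigned across the coded responses.
code_counts = Counter(code for codes in coded.values() for code in codes)
```

Because the codebook is plain data, it can be versioned and re-applied to the next wave of survey responses, which is what makes the codes re-usable. The `uncoded` list shows where the rules fall short and where a topic model could surface themes the codebook does not yet name.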


Written by Corina Paraschiv

Mixed Methods Design Researcher and Podcaster at “Mixed Methods Research” and “Healthcare Focus”.
