Choose the Right One: Evaluating Topic Models for Business Intelligence
Python tutorial for evaluating top-notch bigram topic models in customer email classification
Topic models are used in businesses to classify brand-related text datasets (such as product and site reviews, surveys, and social media comments) and to track how customer satisfaction metrics change over time.
There is a myriad of topic models to choose from: the widely used BERTopic by Maarten Grootendorst (2022), the recent FASTopic presented at last year’s NeurIPS (Xiaobao Wu et al., 2024), the Dynamic Topic Model by Blei and Lafferty (2006), or the new semi-supervised Seeded Poisson Factorization model (Prostmaier et al., 2025).
In a business use case, when we train several topic models on the same customer texts, the results are often not identical and sometimes even conflict. In business, imperfections cost money, so engineers should put into production the model that solves the problem most effectively. New topic models keep appearing, and the methods and metrics for evaluating their quality evolve at the same pace.
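To make the idea of an evaluation metric concrete, here is a minimal sketch of one simple, widely used measure: topic diversity, the share of unique terms among the top terms of all topics. It is shown only as an illustration and is not necessarily one of the metrics used later in this tutorial. The function name `topic_diversity`, the two model outputs, and their bigram terms are hypothetical.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Share of unique terms among the top_k terms of every topic.

    1.0 means no term is shared between topics; lower values mean
    the topics overlap and are harder to tell apart.
    """
    top_terms = [term for topic in topics for term in topic[:top_k]]
    if not top_terms:
        raise ValueError("topics must contain at least one term")
    return len(set(top_terms)) / len(top_terms)


# Hypothetical bigram topics from two models trained on the same emails
model_a = [
    ["late delivery", "tracking number", "delivery time"],
    ["refund request", "money back", "return policy"],
]
model_b = [
    ["late delivery", "refund request", "tracking number"],
    ["refund request", "late delivery", "return policy"],
]

print(topic_diversity(model_a, top_k=3))
print(topic_diversity(model_b, top_k=3))
```

In this toy data, the first model keeps delivery and refund issues in separate topics, while the second mixes them, so its diversity score is lower. A single score like this is only one signal and would normally be read alongside other metrics.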
This practical tutorial will focus on bigram topic models, which provide more relevant information and identify better…