…on of state of the art is the datasets on which these models are tested. Benchmark datasets are key to reproducibility and comparative analysis. In topic classification tasks, three popular datasets are:
Sogou News and
Yahoo! Answers. They differ in corpus size and in the number of topics (classes) present in the data.