…on of state of the art is datasets on which these models are tested. Benchmark datasets are key for reproducibility and comparative analysis. In topic classification tasks, three popular datasets are:
Sogou news and
Yahoo! answers. They differ in corpus size and number of topics (classes) present in the data.