Can you please provide the dataset you used ( if it can be open sourced, or any other dataset for…
Abhishek Sah

Hello, Abhishek.

Unfortunately, our datasets can’t be open-sourced. However, since we used unsupervised methods, pretty much any text corpus should work for you.

In addition, you could use RSS feeds from news sites to grab some content and extract the categories and texts from there. I’m not sure whether a corpus like that of decent size is already available.
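To give a rough idea of the RSS approach, here is a minimal sketch that pulls the text and categories out of feed items using only the Python standard library. The sample feed and the `parse_rss_items` helper are made up for illustration; in practice you would download the XML from a real feed URL, and real feeds may use different tags or need extra cleanup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed (an assumption for illustration); in practice
# the XML would be downloaded from a news site's RSS URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Markets rally</title>
      <description>Stocks rose on Monday.</description>
      <category>Business</category>
    </item>
    <item>
      <title>Cup final tonight</title>
      <description>Two teams meet in the final.</description>
      <category>Sports</category>
      <category>Football</category>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item>, holding its text and its category labels."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        description = item.findtext("description", default="").strip()
        # An item may carry zero, one, or several <category> tags.
        categories = [c.text.strip() for c in item.findall("category") if c.text]
        records.append({
            "text": f"{title}. {description}".strip(),
            "categories": categories,
        })
    return records


records = parse_rss_items(SAMPLE_RSS)
for record in records:
    print(record["categories"], record["text"])
```

Collecting such records from several feeds over time would give you texts to train on, with the feed categories available as rough labels for checking the topics you get.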

Finally, the Gensim library seems to have become quite popular for topic modeling. You might also want to check it out.

Best regards.
