22 new stopword languages - 54 in total
Yay! We’re really happy to support stopword removal for 54 languages. We’ve added 22 from stopwords-json and feel the module is feature complete enough to deserve a bump to version 1.0.0.
The languages supported from today
Existing languages
Before this release we already had Afrikaans, Modern Standard Arabic, Bengali, Danish, German, English, Spanish, Farsi, Finnish, French, Hausa, Hebrew, Hindi, Indonesian, Italian, Japanese, Lugbara, Dutch, Norwegian, Polish, Portuguese, Brazilian Portuguese, Punjabi Gurmukhi, Russian, Somali, Sotho, Swedish, Swahili, Vietnamese, Yoruba, Chinese Simplified and Zulu.
New languages
The new languages added are: Armenian, Basque, Breton, Bulgarian, Catalan, Croatian, Czech, Esperanto, Estonian, Galician, Greek, Hungarian, Irish, Korean, Latin, Latvian, Marathi, Romanian, Slovak (Slovakian), Slovenian, Thai and Turkish.
Nice to see it is used
Every week we see new packages include the stopword module among their dependencies: 744 in total on GitHub now, and hopefully many more to come. On npmjs.com it is installed a little under 7000 times per week, growing steadily from 0 in 2015. It’s easy to use both in Node.js and in the browser.
More flexible future?
We’re looking into the possibility of adding a list of custom stopwords to whichever pre-generated stopword list you are using. Hopefully it will be backwards compatible, but more about that another time.
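To make the idea a bit more concrete, here is a rough sketch of what combining a custom list with a pre-generated one could look like. Nothing here is the module's actual API or a decided design: `englishStopwords` is a tiny hypothetical stand-in for a pre-generated list, and `mergeStopwords` and the local `removeStopwords` are illustrative helpers written just for this example.

```javascript
// Hypothetical stand-in for a pre-generated stopword list (tiny on purpose).
const englishStopwords = ['a', 'an', 'and', 'is', 'of', 'the', 'to'];

// Custom, domain-specific stopwords supplied by the user (assumed example data).
const customStopwords = ['module', 'version'];

// Merge any number of lists into one, dropping duplicate entries.
function mergeStopwords(...lists) {
  return [...new Set(lists.flat())];
}

// Illustrative filter only, not the stopword module's own implementation.
// Comparison is case-insensitive; the surviving tokens keep their original case.
function removeStopwords(tokens, stopwords) {
  const lookup = new Set(stopwords.map((word) => word.toLowerCase()));
  return tokens.filter((token) => !lookup.has(token.toLowerCase()));
}

const tokens = 'the new version of the module is a joy to use'.split(' ');
const combined = mergeStopwords(englishStopwords, customStopwords);
const result = removeStopwords(tokens, combined);

console.log(result);
```

Building the combined list once and reusing it keeps the call site the same as when filtering with a single list, which is one way such a feature might stay backwards compatible.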
So for now: Happy stopword removal, and hope the new version suits you well. Shout out if you have any ideas or issues with the module.