Classification of text input using ML

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

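Build a pipeline that turns each document into a bag-of-words count vector (CountVectorizer), reweights the counts with TF-IDF (TfidfTransformer), and feeds the result into a multinomial naive Bayes classifier (MultinomialNB).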


text_clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultinomialNB()),
])
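To see what the first two steps hand to the classifier, here is a minimal sketch that runs them on their own. The two toy sentences are made up for illustration and are not from the dataset used in this post.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Toy documents, invented for this sketch.
docs = ["the cat sat on the mat", "the dog sat"]

# Step 1: bag of words. Rows are documents, columns are vocabulary terms.
vect = CountVectorizer()
counts = vect.fit_transform(docs)
vocab = vect.vocabulary_  # dict mapping each term to its column index

# Step 2: TF-IDF reweighting. With the default settings each row is
# scaled to unit L2 length.
tfidf = TfidfTransformer()
weights = tfidf.fit_transform(counts)

print(sorted(vocab))
print(counts.toarray())
print(weights.toarray().round(2))
```

The classifier at the end of the pipeline only ever sees the `weights` matrix, never the raw strings.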

Fit the pipeline on the training documents (twenty_train.data) and their target labels (twenty_train.target).


text_clf = text_clf.fit(twenty_train.data, twenty_train.target)
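For intuition about what fitting the naive Bayes step involves, here is a rough pure-Python sketch. It is not scikit-learn's implementation: it assumes whitespace tokenization, skips the TF-IDF step, and uses add-one smoothing (the same value as MultinomialNB's default alpha=1.0). The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplified tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_nb(docs, labels, alpha=1.0):
    # "Fitting" is just counting: documents per class and words per class.
    vocab = set()
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        vocab.update(tokens)
        word_counts[label].update(tokens)
    return vocab, word_counts, class_counts, alpha


def predict_nb(model, doc):
    vocab, word_counts, class_counts, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior plus the sum of smoothed log likelihoods of each token.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(doc):
            if token not in vocab:
                continue  # ignore words never seen during training
            count = word_counts[label][token]
            score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set, invented for this sketch.
docs = [
    "the gpu renders graphics fast",
    "new graphics card released",
    "god and faith in church",
    "church service on sunday",
]
labels = ["graphics", "graphics", "religion", "religion"]

model = train_nb(docs, labels)
print(predict_nb(model, "fast graphics card"))
print(predict_nb(model, "church on sunday"))
```

The pipeline does the same kind of counting, only on TF-IDF weights instead of raw word counts.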

Predict the category of new documents. Here docs_new is a list of strings to classify.


predicted = text_clf.predict(docs_new)
for doc, category in zip(docs_new, predicted):
    # Map each predicted class index back to its human-readable name
    print('%r => %s' % (doc, twenty_train.target_names[category]))
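The snippets in this post rely on twenty_train and docs_new, which are not defined here. The names suggest the 20 newsgroups dataset that scikit-learn can download with fetch_20newsgroups, but that is an assumption. To try the whole flow without a download, here is a minimal self-contained sketch with a tiny made-up corpus. train_data, train_target, target_names and the example sentences are hypothetical stand-ins.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in for the training set: documents, integer labels,
# and the names those labels map to.
train_data = [
    "OpenGL renders images on the GPU",
    "the graphics card draws fast 3D images",
    "God is love and faith",
    "the church teaches love and prayer",
]
train_target = [0, 0, 1, 1]
target_names = ["comp.graphics", "soc.religion.christian"]

text_clf = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", MultinomialNB()),
])
text_clf.fit(train_data, train_target)

# New, unseen documents to classify (also made up).
docs_new = ["God is love", "OpenGL on the GPU is fast"]
predicted = text_clf.predict(docs_new)
for doc, category in zip(docs_new, predicted):
    print("%r => %s" % (doc, target_names[category]))
```

With only four training sentences this is a toy, so the predictions say nothing about real-world accuracy. Swapping in a real training set only changes the three stand-in variables.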