Published in Analytics Vidhya

Understanding Calculation of TF-IDF by Example


TF-IDF (term frequency-inverse document frequency) is a statistical measure that evaluates how relevant a word is to a document in a collection of documents.

It plays an important role in information retrieval and text mining.

A survey conducted in 2015 found that 83% of text-based recommender systems in digital libraries used TF-IDF.
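To make the definition above concrete, here is a minimal sketch of one common way to compute TF-IDF by hand. It assumes term frequency is the term's count divided by the document length and inverse document frequency is the natural log of (number of documents / number of documents containing the term). Other variants exist (smoothed IDF, log-scaled TF), so treat this as an illustration rather than the only definition. The function name `tf_idf` and the toy corpus are made up for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumption: tf = count / document length, idf = ln(N / df).
    Libraries often use smoothed or otherwise adjusted variants.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus, invented for illustration only.
docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
```

With this corpus, "the" appears in every document, so its IDF is ln(3/3) = 0 and its score is zero everywhere. "sat" appears in only one document and therefore scores higher in the first document than "cat", which appears in two.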




Jerry An
