Calculating Sentence Similarity Using the BERT Model
In this post, we are going to use a pre-trained BERT model with the Hugging Face Transformers library to calculate cosine similarity scores between sentences.
In recent years, large language models (LLMs) have become increasingly popular. LLMs have many uses, and one of them is measuring sentence similarity.
Thanks to the Hugging Face Transformers library, we can handle text processing tasks much more easily.
Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based language model for natural language processing (NLP), pre-trained and released by Google. We can use BERT for different goals such as classification, sentence similarity, or question answering. In this post, we will use a BERT model to check the similarity between sentences.
These are the steps to calculate sentence similarity:
- Import a pre-trained BERT model from Transformers.
- Compute an embedding for each sentence with the model.
- Use cosine similarity to score the similarity between the sentence embeddings.
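The steps above can be sketched end to end as follows. This is a minimal sketch, not necessarily the exact code used later in the post: it assumes the `bert-base-uncased` checkpoint and mean-pools the token embeddings (with the attention mask) to get one vector per sentence.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.metrics.pairwise import cosine_similarity

# Step 1: load a pre-trained BERT model and its tokenizer
# (bert-base-uncased is one common choice; other checkpoints work too)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["The cat sits on the mat.", "A cat is resting on a rug."]

# Tokenize both sentences in one padded batch
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

# Step 2: compute token embeddings, then mean-pool them into sentence vectors,
# using the attention mask so padding tokens are ignored
with torch.no_grad():
    outputs = model(**inputs)
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

# Step 3: cosine similarity between the two sentence vectors
score = cosine_similarity(embeddings[0:1].numpy(), embeddings[1:2].numpy())[0][0]
print(score)
```

The score lies between -1 and 1, with higher values indicating more similar sentences; semantically close pairs like the two above typically score high.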
Let’s start by importing the Python libraries.
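A plausible set of imports for this pipeline, assuming we use Transformers for the model, PyTorch as the backend, and scikit-learn for cosine similarity:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.metrics.pairwise import cosine_similarity
```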