
Hands-on Tutorials

Interpreting Semantic Text Similarity from Transformer Models

Can we visualize the context being used for search?

4 min read · Apr 20, 2021


Photo by henry perks on Unsplash

Using transformer-based models for searching text documents is awesome; nowadays it is easy to implement with the Hugging Face library, and the results are often very impressive. Recently I wanted to understand why a given result was returned; my initial thoughts went to the various papers and blog posts on digging into the attention mechanisms inside the transformers, which seemed a bit involved. In this post I test out a very simple approach, based on some simple vector math, to get a glimpse into the context similarities these models pick up when doing contextual search. Let's try it out.

For the purposes of this post I'll use a model from the sentence-transformers library that has been specifically optimized for semantic textual similarity search. The model essentially creates a 1024-dimensional embedding for each sentence passed to it, and the similarity between two sentences can then be calculated as the cosine similarity between the corresponding vectors. Say we have two questions A and B, which get embedded into 1024-dimensional vectors A and B, respectively; the cosine similarity between the sentences is then calculated as follows:
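
$$\text{similarity}(A, B) = \cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$$

In code, a minimal sketch of this setup might look like the following. The model name `stsb-roberta-large` is an assumption on my part (any sentence-transformers model tuned for semantic textual similarity that returns 1024-dimensional embeddings matches the description above), and the two example questions are purely illustrative:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed STS-tuned model producing 1024-dimensional sentence embeddings.
model = SentenceTransformer("stsb-roberta-large")

question_a = "How do I reset my password?"
question_b = "What is the procedure for changing my password?"

# Each sentence is mapped to a single fixed-size embedding vector.
emb_a, emb_b = model.encode([question_a, question_b])

# Cosine similarity: dot product divided by the product of the vector norms.
similarity = np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
print(f"Cosine similarity: {similarity:.3f}")
```

Semantically related questions like the two above should score close to 1, while unrelated sentences should land near 0.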

Published in TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Written by Mathias Gruber
Chief Data Scientist & Full Stack Developer