Embedding: Types, Use cases and Evaluation (Part 3 of RAG Series)
Making computers understand text
This is part 3 of the “Retrieval-Augmented Generation (RAG) — Basics to Advanced Series”. Links to the other blogs in the series are at the bottom of this blog. Continuing from part 1 (RAG Basics) and part 2 (Chunking), this blog focuses on the “Embedding” component (highlighted in blue), which applies both to the chunks of the source content and to the query. Since the concept is fundamentally the same for both, we will cover them together.
What is Embedding?
In the last blog (Chunking), we discussed how we can break the source content (S) and the query into small chunks using various chunking strategies.
Computers work with numbers, so these chunks of text have to be encoded into a numerical (mathematical) form that a computer can read and process. We also expect these numbers to preserve the relationships between each word or chunk and the others, so that related pieces of text end up with related numbers.
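To make this concrete, here is a minimal sketch of the idea in Python. It is not how real embedding models work: it uses simple word counts as the "numbers", which only capture word overlap and not meaning. The helper names (tokenize, build_vocab, embed, cosine_similarity) and the sample chunks are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(chunks):
    """Collect every distinct word across all chunks, in sorted order."""
    return sorted({tok for chunk in chunks for tok in tokenize(chunk)})


def embed(text, vocab):
    """Toy 'embedding': one number per vocabulary word (its count in the text)."""
    counts = Counter(tokenize(text))
    return [float(counts[word]) for word in vocab]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample chunks, purely for illustration.
chunks = [
    "the cat sat on the mat",
    "a cat lay on a mat",
    "stock prices fell sharply today",
]

vocab = build_vocab(chunks)
vectors = [embed(chunk, vocab) for chunk in chunks]

# Chunks that share words get a higher score than chunks that share none.
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])

print(f"chunk 0 vs chunk 1: {sim_related:.3f}")
print(f"chunk 0 vs chunk 2: {sim_unrelated:.3f}")
```

In this toy run the two cat sentences score 0.375 against each other, while the cat sentence and the stock sentence score 0.000. That is the "relationship between chunks" expressed as a number. A count-based vector like this would still treat "cat" and "kitten" as unrelated, which is the gap that learned embeddings are meant to close.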