Vector Search for Newborns 👶

The concept of vectors has emerged as a powerful tool in data storage and retrieval. Here’s an introduction to Vectors and Vector Search.

Sadiqur Rahman
Insider Engineering
5 min read · Mar 27, 2024


Let’s imagine you have a movie database where a single data point looks like this:

{
  "title": "I Am Legend",
  "director": "Francis Lawrence",
  "year": "2007",
  "plot": "In a post-apocalyptic world, a man named Robert Neville, along with his loyal dog, struggles to survive amidst a city populated by mutants infected by a deadly virus.",
  …
}

Say you are looking for a movie, but you don’t know how to search for it. All you remember is a man and his dog trying to survive in a post-apocalyptic world. If you run the following query, you will probably get thousands of movies:

SELECT * FROM movies WHERE plot LIKE '%man%';

If you run the following query you might not find any:

SELECT * FROM movies WHERE plot LIKE '%man with his dog%';

How great would it be if you could search as if you were talking to a person: “a movie where a man and his dog try to survive in a post-apocalyptic world”.

In short, that is exactly what vector search does.

In the world of data storage and retrieval, the concept of vectors has emerged as a powerful tool, transforming how we perceive and interact with information.

But what exactly is a vector?

In essence, a vector is a numeric representation of data, similar to a point plotted on a coordinate system. Through a process known as encoding (or embedding), raw data is transformed into these vectors, unlocking a world of possibilities in semantic data storage.

To better understand, let’s consider this analogy: imagine we assign the word “king” to an encoder, which returns a coordinate, let’s say (3,4). Similarly, “man” might be represented as (3,5), and “ruler” as (4,3). As you can see, the vector for “king” lies close to both “man” and “ruler” in this hypothetical coordinate space.

This proximity illustrates the semantic relationships encoded within vectors, offering a semantic understanding of data.
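The toy coordinates above can be checked with a few lines of code. Note this is purely illustrative: the (3, 4)-style points come from the analogy, not from a real encoder, which would produce far more dimensions.

```python
import math

# Hypothetical 2-D "embeddings" from the analogy above -- real encoders
# produce hundreds or thousands of dimensions.
words = {"king": (3, 4), "man": (3, 5), "ruler": (4, 3)}

def euclidean(a, b):
    # Straight-line distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean(words["king"], words["man"]))    # 1.0
print(euclidean(words["king"], words["ruler"]))  # ~1.41
```

Both distances are small, which is exactly the “king is close to man and ruler” intuition the analogy describes.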

In practical applications, vectors often reside in high-dimensional spaces, with dimensions ranging into the thousands. For instance, the MongoDB document below stores a 1536-dimensional vector (the output size of OpenAI’s text-embedding-ada-002 model), exemplifying the scale and versatility of this approach.

Here is a glance at a real data point:

{
  _id: ObjectId('65367ce657b8cf01b51e89b0'),
  content: "Spaghetti Westerns, made in Europe between 1960 and 1975…",
  embedding: Array (1536)
    0: -0.013757396
    1: -0.030196106
    2: 0.0071230233
    3: -0.03616015
    ...
    1535: -0.015799705
}

At the heart of vector search are similarity functions: tools that measure how similar two vectors are. Euclidean distance, for instance, measures the straight-line distance between vectors. It works well for dense data, like when we’re comparing images.

Cosine similarity, on the other hand, looks at the angle between vectors, focusing on their orientation rather than their magnitude. This makes it useful for comparing sparse data, such as textual themes.

Lastly, the dot product combines both angle and magnitude, providing a balanced way to analyze sparse data.
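The three functions can be sketched in plain Python (no external libraries; the sample vectors are made up for illustration):

```python
import math

def euclidean_distance(a, b):
    # Straight-line distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    # Combines angle and magnitude: larger means more similar.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Angle only: close to 1.0 means same direction, regardless of length.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot_product(a, b) / (norm_a * norm_b)

v1, v2 = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(cosine_similarity(v1, v2))   # ~1.0: same direction
print(euclidean_distance(v1, v2))  # ~3.74: yet not the same point
```

Notice how the two measures disagree here: `v2` points in exactly the same direction as `v1` (cosine similarity ≈ 1.0) but is twice as long, so the Euclidean distance is large. Which function to use depends on whether orientation or position matters for your data.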

With the advancement of Large Language Models (LLMs) such as OpenAI’s GPT models and Google’s Gemini, vector search is unlocking unprecedented capabilities in semantic understanding and context-aware analysis.

Vector Search Diagram

Here is a diagram to illustrate vector search. First, text is sent to an embedding model, which returns a numerical representation (a multidimensional coordinate) of the text. The application then saves this vector to the database along with the original content; the resulting document looks very similar to the one shared a few paragraphs above.

While querying, the query text is embedded first, giving us a “multidimensional coordinate”, i.e. a vector. This vector is sent to the database to find data points that are close to it. As the database returns the resulting data points (documents), it also provides similarity scores (how close each resulting document is to the query text).
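The whole store-then-query flow can be sketched in memory. Here, `embed` is a deliberately toy stand-in (keyword counts) for a real embedding model, and `database` is a plain list rather than a real vector database; only the shape of the flow matches the diagram:

```python
import math

# Toy stand-in for a real embedding model: counts a few hand-picked
# keywords. Real models return dense vectors (e.g. 1536 dimensions).
VOCAB = ["dog", "survive", "apocalyptic", "love", "space"]

def embed(text):
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Ingestion: embed each plot and store content + vector together.
database = []
for plot in [
    "a man and his dog survive an apocalyptic plague",
    "two people fall in love in space",
]:
    database.append({"content": plot, "embedding": embed(plot)})

# Query: embed the query text, then rank stored documents by similarity.
query_vec = embed("man with his dog trying to survive in a post apocalyptic world")
results = sorted(
    ({"content": d["content"], "score": cosine(query_vec, d["embedding"])}
     for d in database),
    key=lambda r: r["score"], reverse=True,
)
print(results[0]["content"])  # the dog/survival plot ranks first
```

A real vector database does the same ranking, but with approximate nearest-neighbor indexes so it scales to millions of documents instead of a linear scan.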

Before we wrap up, let’s look at an application of vector search. We all know about ChatGPT: it can answer pretty much any question you ask. But you may also have noticed that ChatGPT does not have the latest information (at the time of writing, its training data ends in 2022). What if you could somehow connect the latest information to GPT? Using vector search, we can do exactly that by implementing Retrieval-Augmented Generation (RAG).

If you remember how we query data from the vector database, RAG adds one more step on top of that. In simple words, we tell the LLM: “Here is a question, and these text documents contain the answer. Please answer the question based on the provided texts.” The LLM takes this prompt and generates an answer. This method is known as RAG.
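That prompting step might be sketched like this. Here `retrieved_docs` stands in for the documents a vector search would return, and the actual LLM call is left as a placeholder since it depends on which provider you use:

```python
# Minimal sketch of the RAG prompting step. `retrieved_docs` stands in
# for vector-search results; the sample texts are made up.
retrieved_docs = [
    "MongoDB Atlas added vector search support for embeddings.",
    "RAG combines retrieval with generation for up-to-date answers.",
]
question = "How can I give an LLM access to recent information?"

context = "\n".join(f"- {doc}" for doc in retrieved_docs)
prompt = (
    "Here is a question, and these text documents contain the answer. "
    "Please answer the question based on the provided texts.\n\n"
    f"Documents:\n{context}\n\nQuestion: {question}"
)

# answer = call_llm(prompt)  # hypothetical: plug in your LLM client here
print(prompt)
```

Because the retrieved documents are injected at query time, the LLM can answer from information it was never trained on.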

That was all about vector search 101. If you have learned something, don’t forget to celebrate and thank yourself 🥳🎉 Wish you happy learning! Until the next time, take care 👋

I hope you enjoyed this article. If you have any questions, please feel free to contact me on LinkedIn or comment below.

Follow us on the Insider Engineering Blog to read more about our AWS solutions at scale and engineering stories.

