What does OpenAI’s announcement mean for Retrieval Augmented Generation (RAG) and Vector-only Databases?

Madhukar Kumar
3 min read · Nov 7, 2023


In case you missed it, OpenAI made a series of announcements today. I plan to delve into the other significant items in a future post, but for now, let's focus on one revolutionary announcement that eliminates the need for vector-only databases in certain use cases: an OpenAI Retrieval tool that doesn't require you to create or search vectors yourself.

Until now, if you were developing a large language model (LLM) based application that was aware of your data (i.e., your company's data behind a firewall or inside a virtual private cloud), you would typically combine tools such as LangChain, LlamaIndex, and a vector-only database. The overall architecture would resemble the following.

Using LangChain and Llamaindex for Retrieval

Today, OpenAI introduced a new concept called Assistants, which lets you configure an architecture similar to the one above in a low/no-code manner. This eliminates the need for a vector-only database and simplifies the entire process to just two steps. Moreover, once you've created an Assistant, you can access it with a few lines of code, as the sketch below shows.
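As a rough sketch (not official sample code), here is what that could look like with the openai Python SDK (v1.x, Assistants beta as of this writing); the file name, instructions, and question are placeholders of my own:

```python
# A minimal sketch using the openai Python SDK (v1.x, Assistants beta, Nov 2023).
# The file name, instructions, and question below are placeholders.
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: upload a document for the built-in Retrieval tool to index
file = client.files.create(
    file=open("product_docs.pdf", "rb"),
    purpose="assistants",
)

# Step 2: create an Assistant with the Retrieval tool attached
assistant = client.beta.assistants.create(
    name="Docs Helper",
    instructions="Answer questions using the attached product documentation.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[file.id],
)

# Conversations happen on Threads: create one, add a message, run it
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What are the supported deployment options?",
)
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

# Poll until the run finishes, then read the latest reply
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```

Notice there is no embedding model, no chunking logic, and no vector store anywhere in that code; Retrieval handles all of it behind the API.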

Furthermore, you can now send additional files to OpenAI via the API, and you can send context of up to 128K tokens (via the new GPT-4 Turbo model), the equivalent of about 300 pages of text. When you access these Assistants from your code, you can also give each Assistant access to up to 128 tools, including functions that make external API calls and feed the results back for the Assistant to process.
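To illustrate the tools piece, here is a hedged sketch of attaching a function tool alongside Retrieval; the function name, its schema, and the ticket-lookup scenario are hypothetical, not from OpenAI's announcement:

```python
# A sketch of an Assistant with both the Retrieval tool and a function tool.
# The function name and JSON schema below are hypothetical.
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Ops Assistant",
    instructions="Use get_ticket_status to look up support tickets when asked.",
    model="gpt-4-1106-preview",
    tools=[
        {"type": "retrieval"},
        {
            "type": "function",
            "function": {
                "name": "get_ticket_status",
                "description": "Fetch the current status of a support ticket",
                "parameters": {
                    "type": "object",
                    "properties": {"ticket_id": {"type": "string"}},
                    "required": ["ticket_id"],
                },
            },
        },
    ],
)
```

When a run pauses with status `requires_action`, your code executes the function itself (the model never calls your API directly) and posts the result back with `client.beta.threads.runs.submit_tool_outputs(...)` so the Assistant can continue.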

This is what the Assistants-based architecture looks like:

OpenAI’s Assistant and Retrieval Tool

Here's a crucial piece of information from OpenAI's official announcement about the Retrieval tool that jumps out: "[The tool]…augments the assistant with knowledge from outside our models, such as proprietary domain data, product information, or documents provided by your users. This means you don't need to compute and store embeddings for your documents or implement chunking and search algorithms. The Assistants API optimizes what retrieval technique to use based on our experience building knowledge retrieval in ChatGPT."

Over the next few days, many developers will be testing this new feature, and it will be interesting to see how the use of LlamaIndex and vector-only databases evolves. However, while this eliminates the need for individual and independent developers to adopt yet another vector-only database when building new applications, large enterprises still have petabytes of data in SQL, NoSQL, binary, HDFS, and other formats. If you're a large enterprise building a data-aware LLM application, you'll still need a contextual database: one that can store and retrieve different data types using hybrid search, combining lexical and semantic search, as sketched below. Nevertheless, it's fascinating to see such rapid development from OpenAI.
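As a rough illustration of what I mean by hybrid search, here is a toy Python sketch of my own (not any particular database's API) that blends a lexical keyword score with a semantic cosine-similarity score; real contextual databases do this at scale with inverted indexes and vector indexes, and the documents, embeddings, and weight here are made up:

```python
# Toy hybrid search: blend a lexical score with a semantic (vector) score.
import math

def lexical_score(query: str, doc: str) -> float:
    """Fraction of query terms present in the doc (a crude BM25 stand-in)."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5):
    """Rank (text, embedding) pairs by a weighted mix of both signals."""
    scored = [
        (alpha * lexical_score(query, text) + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    return sorted(scored, reverse=True)

# Embeddings would come from a model in practice; these are made up.
docs = [
    ("quarterly revenue report", [0.9, 0.1]),
    ("employee onboarding guide", [0.1, 0.8]),
]
print(hybrid_search("revenue numbers", [0.8, 0.2], docs))
```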

In case you're curious, OpenAI's documentation lists the file types currently supported by the Retrieval tool.

Stay tuned as I dig deeper into these new features and speak with customers about how this impacts their overall use cases and development efforts.

In either case, with this announcement, I would be really concerned if I were a vector-only database company.
