Explore secure RAG techniques for content

Scott Hurrey
Box Developer Blog
7 min read · Oct 3, 2024
[Image: a keyboard with the A key replaced with a glowing blue AI key. Photo by BoliviaInteligente on Unsplash]

Retrieval Augmented Generation (RAG) is a technique used with Generative AI to augment Large Language Model queries with data specifically relevant to those queries. It combines the power of both retrieval-based tools and language generation models, enabling more accurate and contextually relevant responses.

At its core, RAG consists of two main components: a retriever and a generator. The retriever is responsible for retrieving relevant information from a large knowledge base or document collection, while the generator generates coherent and contextually appropriate responses based on the retrieved information.

To understand how RAG works, let’s dive into its technical details. The retriever component utilizes dense vector representations to encode both the input query and the documents in the knowledge base. These embeddings capture semantic similarities between words and sentences, allowing for efficient retrieval of relevant documents.
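To make the retriever concrete, here is a deliberately tiny sketch in Python. The trigram-hashing `embed` function is only a stand-in for a real embedding model; the point is the shape of the loop, in which both the query and every document are encoded as unit vectors and ranked by cosine similarity.

```python
import math

def embed(text, dims=256):
    # Toy embedding: hash character trigrams into a fixed-size vector.
    # A real retriever would use a trained embedding model here.
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, documents, top_k=2):
    # Rank every document by similarity to the query embedding.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]

docs = [
    "Box AI handles permissions and RAG for you",
    "Bananas are rich in potassium",
    "Vector embeddings capture semantic similarity",
]
print(retrieve("How does Box AI handle RAG?", docs, top_k=1))
```

In production, the document embeddings are computed once and stored in a vector database, so only the query is embedded at request time.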

Once the retriever identifies potential candidate documents, it passes them to the generator component. The generator employs powerful language models, like OpenAI's GPT-4 or Anthropic's Claude, to generate responses based on the retrieved information. These language models are trained on vast amounts of text data and can produce human-like text by predicting what comes next in a given sequence.
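The hand-off between the two components usually amounts to assembling the retrieved passages into the prompt so the model answers from that context. A minimal prompt-assembly helper (the template wording here is just an illustration) might look like:

```python
def build_rag_prompt(query, retrieved_docs):
    # Number each retrieved passage so the model (and the user) can
    # refer back to specific sources in the answer.
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

print(build_rag_prompt("What does the retriever do?", [
    "The retriever encodes the query and documents as dense vectors.",
    "Candidate documents are ranked by cosine similarity.",
]))
```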

The key advantage of using RAG in AI-powered applications lies in its ability to provide accurate and informative responses by leveraging pre-existing knowledge. By combining retrieval-based techniques with language generation capabilities, RAG can handle complex queries that require a deeper understanding beyond simple keyword matching.

Using RAG when creating AI-powered applications offers several benefits. First, it enhances the user experience by providing more precise answers tailored to specific queries. Secondly, it enables developers to leverage existing content efficiently without manually curating every possible response. This saves time and resources while maintaining accuracy.

In addition, RAG allows for continuous learning as new content becomes available. By updating or expanding the underlying document collection regularly, RAG can adapt to changing contexts and provide up-to-date responses.

With this context in mind, we at Box are widely known for our enterprise-grade security and our ability to scale and handle all of your content in a secure and permissions-integrated way. As Gen AI continues to gain ground across corporate environments, combining the power of AI with the security and reliability of Box creates a powerful way to build applications that help you, your employees, and customers to get the most value out of your content.

Now let’s talk about ways you can use your secure content in Box with modern gen AI technology to power your RAG workflows. There are three techniques that we recommend to developers as they think through how to implement AI with Box content in their corporate environment.

  • Use Box AI and let us handle the complexities of content permissioning and RAG for you
  • Use Box Platform and Box AI to retrieve responses to your prompts or the citations we identify to answer that prompt from your Box content
  • Use the Box API to index Box content yourself

Let’s look at each of these in a bit more detail.

Box AI’s secure RAG

Relying on the Box AI API for Retrieval Augmented Generation presents a significant opportunity for developers seeking to streamline their AI workflows. By leveraging this cutting-edge technology, developers can offload intricate tasks to Box and harness its advanced capabilities.

[Figure: architecture flow diagram explaining how Box AI handles secure RAG inside of Box]
Box AI Architecture Flow

This is an opportunity for developers to realize the power of Gen AI without having to build, host, and maintain the complex architecture required for RAG. We also handle access permissions for you, alleviating the burden of ensuring end users can’t access content they shouldn’t. By entrusting these complex operations to Box’s sophisticated infrastructure while capitalizing on its array of models, developers can elevate their output quality significantly with less work.

The beauty of this approach lies in its simplicity — a concise prompt initiates a cascade of actions orchestrated by Box AI. From indexing and splitting to embeddings, vectorization of content, and seamless communication with Large Language Models (LLMs), every aspect is handled for you by Box.

One of the standout features of Box AI is our platform-agnostic approach to LLMs. We choose models specifically for their ability to provide quality results for a given use case. That said, we understand that you may prefer a different model or your specific use case might work better with a different provider. We provide developers with a way to choose their own model, by specifying one of our curated models at the individual API call level, so even one application can use more than one model. This versatility empowers developers to tailor their outputs precisely according to specific requirements or experiment with diverse approaches for optimal results.
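As a sketch of what that per-call model choice looks like, here is how a request body for the Box AI ask endpoint can carry an `ai_agent` override. The payload shape follows Box's public API reference, but treat the field names, and especially the model identifier below, as values to verify against the current list of curated models available to your enterprise:

```python
import json

BOX_AI_ASK_URL = "https://api.box.com/2.0/ai/ask"

def build_ask_payload(prompt, file_id, model=None):
    # Base payload: ask a question about a single file.
    payload = {
        "mode": "single_item_qa",
        "prompt": prompt,
        "items": [{"id": file_id, "type": "file"}],
    }
    if model:
        # Per-call model override via the ai_agent config; omit it to
        # accept Box's default model selection for this use case.
        payload["ai_agent"] = {
            "type": "ai_agent_ask",
            "basic_text": {"model": model},
        }
    return payload

payload = build_ask_payload(
    "Summarize the termination clause.",
    "1234567890",
    model="openai__gpt_4o_mini",  # illustrative value, not a confirmed identifier
)
print(json.dumps(payload, indent=2))
# To send it (token acquisition omitted):
# requests.post(BOX_AI_ASK_URL, headers={"Authorization": f"Bearer {token}"}, json=payload)
```

Because the override lives at the individual call level, two requests in the same application can target two different models simply by passing different payloads.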

Box Platform and Box AI for RAG

While Box AI is a fantastic tool in your AI toolbox, you may require a broader set of data to provide the context your RAG workflow requires. You can still use the Box API to make the most of your Box content.

Box Platform is our developer platform, which provides endpoints to do most of the things you can do in the Box UI. In the context of RAG, you can download files, retrieve various representations of files like text, PDF, or images, search for files, or use Metadata Query to filter files based on key-value pairs.
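For example, fetching the text representation of a file is a two-step dance: request the file with `fields=representations` and an `x-rep-hints: [extracted_text]` header, then follow the returned `url_template`. A small helper for the second step might look like this; the response shape follows the Box Representations docs, but verify the field names against the current reference:

```python
def text_representation_url(file_info):
    # Walk the representations entries returned by GET /files/:id
    # (requested with fields=representations and the extracted_text
    # rep hint) and return the downloadable text URL, or None when
    # no text representation exists (e.g. images, video).
    entries = file_info.get("representations", {}).get("entries", [])
    for rep in entries:
        if rep.get("representation") == "extracted_text":
            template = rep["content"]["url_template"]
            # The template ends in a {+asset_path} placeholder; an
            # empty asset path fetches the full text content.
            return template.replace("{+asset_path}", "")
    return None

# Sample file-info response, abbreviated to the fields the helper reads.
sample = {
    "representations": {
        "entries": [
            {
                "representation": "extracted_text",
                "content": {
                    "url_template": "https://dl.boxcloud.com/api/2.0/files/123/representations/extracted_text/content/{+asset_path}"
                },
            }
        ]
    }
}
print(text_representation_url(sample))
```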

Of course, you can also use Box AI APIs. Our AI API provides you with two ways to augment your vector store with relevant data: you can ask Box AI a question about your Box content, or extract key-value pairs from it, and then use either the resulting answer or the citations we identify to bolster your vector store with relevant data.

With this technique, you can extract relevant information from cited sources and convert them into vector embeddings, serving as a foundation for your AI application without having to process your Box content manually. This allows you to gain insights from your most secure content in Box without having to move your files outside of our world-class security controls. The app will only have access to the content specifically shared with either the application or the user the application is acting as. This is quite a powerful and secure way to keep your content safe while getting the value out of it that you need.
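A sketch of that flow: take the citations returned by a Box AI ask call (made with citations enabled) and turn them into records ready to embed and upsert into your vector store. The response field names below follow the Box AI docs, but treat them as assumptions to verify against the current API reference:

```python
def citations_to_records(ai_response, source_tag="box-ai"):
    # Convert Box AI citations into (id, text, metadata) records that a
    # downstream embedding/upsert step can consume. Empty snippets are
    # skipped so the vector store only receives useful text.
    records = []
    for i, cite in enumerate(ai_response.get("citations", [])):
        text = cite.get("content", "").strip()
        if text:
            records.append({
                "id": f"{source_tag}-{cite.get('id', i)}",
                "text": text,
                "metadata": {"file_name": cite.get("name"), "source": source_tag},
            })
    return records

# Abbreviated sample of an ask response with citations.
sample_response = {
    "answer": "The contract term is 24 months.",
    "citations": [
        {"id": "abc", "type": "file", "name": "contract.docx",
         "content": "This agreement runs for a term of 24 months."},
        {"id": "def", "type": "file", "name": "contract.docx", "content": "  "},
    ],
}
print(citations_to_records(sample_response))
```

Each record's `text` field is what you would pass to your embedding model before upserting into the store; the `metadata` keeps the link back to the originating Box file.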

Indexing Box content yourself

We also understand that you may not have access to Box AI or you may want to have more control over the documents you use in your application. If Box AI techniques aren’t available or sufficient for your needs, we also provide a powerful API: Box Platform. With our API, you can download files or retrieve text or PDF representations of your content. This technique gives the developer the most control over the content. Just remember, once the file leaves Box, we can no longer manage permissions and are no longer responsible for security or compliance. If you choose this technique, make sure you are taking this into account as you process your content.

If you choose to index your own files, our APIs and SDKs will provide you with the tool set you need to do so seamlessly. We have also built loaders and retrievers in several popular AI orchestration frameworks, like LangChain and LlamaIndex. Our recommended approach is to first check for a text representation: this is the easiest and least resource-intensive way to get your content. Of course, files like images, videos, and scanned PDFs do not have text representations. For these document types, you can get PDF or image representations instead, or simply use our download endpoint to download the file locally and process it in the manner you choose.
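That fallback order can be captured in a small decision helper. The representation names used here (`extracted_text`, `pdf`, `png`, `jpg`) follow the Box Representations docs; confirm them against the current reference before relying on this sketch:

```python
def choose_extraction_strategy(file_info):
    # Pick how to get text for indexing, following the recommended
    # order: text representation first, then PDF/image representations,
    # then a plain download. file_info is a Box file-info dict requested
    # with fields=representations.
    reps = {
        entry.get("representation")
        for entry in file_info.get("representations", {}).get("entries", [])
    }
    if "extracted_text" in reps:
        return "extracted_text"
    if "pdf" in reps:
        return "pdf"
    if reps & {"png", "jpg"}:
        return "image"
    return "download"

doc = {"representations": {"entries": [{"representation": "extracted_text"}]}}
scan = {"representations": {"entries": [{"representation": "png"}]}}
print(choose_extraction_strategy(doc), choose_extraction_strategy(scan))
```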

By retrieving your files from Box and incorporating them into your vector store alongside other data sources, you enrich your AI application’s knowledge base and enhance its performance when utilizing models with RAG. Just remember to handle any privacy considerations related to accessing sensitive information stored in Box files while adhering to applicable data protection regulations within your organization.

In this blog, we talked about techniques to incorporate your Box content into your AI applications at a high level: you can let Box AI do all the work, use Box AI as a tool to supplement your embeddings with relevant data from your Box content, or index your Box files yourself. All are valid, and all have their pros and cons. In the coming weeks, look for more blogs diving deeper into each of these techniques, with sample code using popular AI orchestration frameworks and data tools like LangChain, LlamaIndex, and Pinecone.

In order to use the Box AI API endpoints, you must be an Enterprise Plus customer. You must have an application created in the developer console with the appropriate Box AI scope, and your Box instance must have Box AI enabled.

🦄 Want to engage with other Box Platform champions?

Join our Box Developer Community for support and knowledge sharing!
