Querying data and documents with LLMs using Kernel Memory
Previously, in our article about Semantic Kernel, we talked about how Microsoft is doing a great job of providing developers with tools and SDKs to include AI in their applications.
Semantic Kernel allows us to orchestrate calls to third-party AI services easily, such as OpenAI, Hugging Face, or Azure AI, in addition to native functions.
And now, thanks to the Kernel Memory project, we can include text documents, spreadsheets, presentations, or web pages that an LLM can exploit.
Let’s dive into the Kernel Memory project and see how to add this new framework to your project to include your organization’s documents in your LLM system.
What is Kernel Memory?
Reading the project’s GitHub repository, we find this description:
Kernel Memory is a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for Retrieval Augmented Generation (RAG), synthetic memory, prompt engineering, and custom semantic memory processing.
In other words, Kernel Memory allows developers to integrate custom documents in different formats. It allows users to search or ask for information inside those documents, including citations to the documents from which the information was extracted.
Once Kernel Memory extracts the information from the documents the developer indicates, there are various ways of exposing it:
- Serverless mode (integrated into a .NET application)
- As a web service
- As a Semantic Kernel plugin
But before continuing, let’s review the formats of the documents that, for the moment, we can use with Kernel Memory.
- Office documents: Word, Excel, and PowerPoint
- Web pages
- Images in JPEG, PNG, and TIFF formats, with text extracted via OCR
- Markdown
- JSON documents
- Plain text files
As a starting point, that is not bad: it covers the Microsoft Office file formats and the other usual formats in which companies store their information.
IKernelMemory: The core
Both the Service mode and the Serverless mode implement the same `IKernelMemory` interface, so we can perform the same operations regardless of how we configure and start Kernel Memory.
The operations defined in this interface can be grouped according to their purpose. The first group contains the operations we use to import documents, web pages, and other formats into our service:

- `ImportDocumentAsync(Document document)`: adds a previously created document to the current Kernel Memory instance.
- `ImportDocumentAsync(string filePath)`: adds the file located at the given path.
- `ImportDocumentAsync(Stream content)`: opens a stream and adds its content to Kernel Memory.
- `ImportTextAsync(string text)`: adds plain text.
- `ImportWebPageAsync(string url)`: adds the content of a web page to Kernel Memory.
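As a rough sketch, a few of these import operations could be used like this (the file path, text, and URL are placeholders, and the `memory` instance is built with `KernelMemoryBuilder`, which we cover in the "Creating the service" section below):

```csharp
using Microsoft.KernelMemory;

// A serverless Kernel Memory instance; its configuration is covered
// in the "Creating the service" section below.
var memory = new KernelMemoryBuilder()
    .WithOpenAIDefaults(Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
    .Build<MemoryServerless>();

// The file path, text, and URL below are placeholders.
await memory.ImportDocumentAsync("reports/tech-trends-2024.pdf");
await memory.ImportTextAsync("Kernel Memory supports RAG over custom documents.");
await memory.ImportWebPageAsync("https://example.com/insights");
```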
The second group contains the operations we use to query the documents imported with the functions of the previous group:

- `AskAsync(string question)`: returns a `MemoryAnswer`. We obtain a response to our question in a chat style, like a ChatGPT response, together with a list of sources (documents) used to build the answer.
- `SearchAsync(string query)`: returns a `SearchResult`. This method gives us a list of documents where users can find the answer to their questions.
The difference between the two operations is that `SearchAsync` searches the given index for a list of documents relevant to the given query, while `AskAsync` searches the given index for an answer to the given query.
Other operations that developers could perform against documents are:

- Delete an existing document using `DeleteDocumentAsync`, passing the document identifier as a parameter.
- Check whether a document is ready for usage with `IsDocumentReadyAsync`, passing the document identifier as a parameter.
As we can see, `IKernelMemory` provides us with the necessary functions to integrate documents and operate on them.
Creating the service
Now that we have a general idea about Kernel Memory, let's go deeper into some of its concepts.
The first thing we have to do is initialize the Kernel Memory service. In our case, we are going to do it in two different ways: serverless or as a web service. If we choose the serverless option, we can use the Kernel Memory extensions to integrate third-party services, like the following:
- OpenAI
- AzureAI
- LlamaSharp, a C# binding for models based on LLaMA
In our case, we will use OpenAI through the `Microsoft.KernelMemory.AI.OpenAI` package.
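A minimal sketch of both options might look like the following (assuming the `KernelMemoryBuilder` and `MemoryWebClient` types from the official packages; the API key and service URL are placeholders):

```csharp
using Microsoft.KernelMemory;

// Serverless: Kernel Memory runs embedded in our .NET application.
static IKernelMemory CreateServerlessMemory(string openAIApiKey) =>
    new KernelMemoryBuilder()
        .WithOpenAIDefaults(openAIApiKey) // needs a valid OpenAI API key
        .Build<MemoryServerless>();

// Web service: we connect to a Kernel Memory service running elsewhere,
// e.g. at a placeholder URL like "http://127.0.0.1:9001".
static IKernelMemory CreateWebServiceClient(string serviceUrl) =>
    new MemoryWebClient(serviceUrl);
```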
In the code above, two methods are used to create the serverless and web service versions of Kernel Memory.
In the serverless case, we add the OpenAI integration using the `WithOpenAIDefaults` method, which needs a valid OpenAI API key.
And what about the service mode? It has been designed to run behind your backend, so you should not expose the Kernel Memory service to public traffic without authenticating your users first.
To store our OpenAI credentials, we recommend using user secrets at the development stage, as in the code example below.
For production environments, we recommend storing credentials using environment variables.
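A sketch of that configuration could look like the code below, assuming the project has a `UserSecretsId` set up via `dotnet user-secrets init` and the key is stored under a hypothetical `OpenAI:ApiKey` entry:

```csharp
using Microsoft.Extensions.Configuration;

// Development: dotnet user-secrets set "OpenAI:ApiKey" "<your-key>"
// Production:  set the OpenAI__ApiKey environment variable instead;
// later providers override earlier ones, so environment variables win.
var configuration = new ConfigurationBuilder()
    .AddUserSecrets<Program>()
    .AddEnvironmentVariables()
    .Build();

var openAIApiKey = configuration["OpenAI:ApiKey"]
    ?? throw new InvalidOperationException("OpenAI API key not configured.");
```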
Importing documents
Now that our Kernel Memory service is up and running, we should add the most essential ingredient to the recipe: our documents.
To add those Office suite files, web pages, or Markdown documents, we must tag the documents.
But let me clarify one thing before continuing. A Kernel Memory Document is not a 1-to-1 relation with a physical file or resource; it is a 1-to-N relationship. This means that a document could be created from three PDF files or only one.
A tag is a key-value pair that helps the AI service behind Kernel Memory interact with our documents. Those tags can be set using a `TagCollection` object or the `AddTag` method available in the `Document` class.
In our first example, we will use the `TagCollection` object, and you will notice that it is a collection of key-value entries.
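A brief sketch (the tag keys, values, and file name are illustrative):

```csharp
using Microsoft.KernelMemory;

// Each entry is a key-value pair; a key may hold several values.
var tags = new TagCollection
{
    { "type", "report" },
    { "year", "2024" }
};

var document = new Document(tags: tags)
    .AddFile("tech-trends-2024.pdf");
```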
In the following example, we will use the `AddTag` method, which accepts two parameters: the key name and a value for the key.
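Continuing the previous sketch, a tag could be added to a document like this:

```csharp
var document = new Document().AddFile("blockchain-sentinel-2022.pdf");

// First parameter is the key name, second is the value for that key.
document.AddTag("topic", "blockchain");
```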
As for adding files to the `Document`, you can use the following methods to add files or resources, as shown in the sketch after this list:

- `AddFile`: adds the file located at the path the method receives as a `string` parameter.
- `AddFiles`: adds a collection of files located at the paths the method receives as an `IEnumerable<string>` collection or a `string[]` array.
- `AddStream`: reads the content of the `Stream` passed as a parameter, using the file name that is also passed as a parameter.
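Putting it all together, a single `Document` could aggregate several resources before being imported (a sketch; all paths are placeholders, and `memory` is the Kernel Memory instance created earlier):

```csharp
var document = new Document(tags: new TagCollection { { "type", "library" } })
    .AddFiles(new[] { "reports/automotive.pdf", "reports/ai-narrative.pdf" })
    .AddStream("notes.txt", File.OpenRead("local/notes.txt"));

await memory.ImportDocumentAsync(document);
```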
Asking
At this time, we know how to create a Kernel Memory service, add some documents, and tag them, but we still need to know how to operate on our documents.
The code below shows how to extract the information included in the response to our question.
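This is a sketch; the question is illustrative, and `memory` is the instance created earlier:

```csharp
var answer = await memory.AskAsync("What are the main tech trends for 2024?");

// The chat-style answer, similar to a ChatGPT response.
Console.WriteLine(answer.Result);

// The sources (documents) used to build the answer.
foreach (var source in answer.RelevantSources)
{
    Console.WriteLine($"- {source.SourceName} ({source.SourceContentType})");
}
```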
Depending on the documents' count and size, the asking operation could take some time. Once the operation finishes, we will obtain a result like this:
It looks like a result was obtained after making a question to an LLM model, such as ChatGPT.
Search
The Search operation gives us a list of documents in which we can look for the information that answers the question posed in our prompt.
In this case, the operation will return an object of type `SearchResult`, which has properties such as `SourceName` or `SourceContentType` that indicate the document’s name and type.
But we can also delve deeper into the documents and make Kernel Memory tell us exactly which part of those documents is necessary to answer our question.
To do this, we will use the `Partitions` property, a collection of objects of the `Partition` class, a nested type of the `Citation` class.
Each of these `Partition` objects has properties that will help us know which part of the document it belongs to is relevant for our search, such as:

- `Text`: the content of the document partition.
- `Relevance`: a value between 0 and 1 that shows the partition’s relevance to the given query.
- `Tags`: an object of `TagCollection` type with the tags associated with the document.
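A sketch that walks through those partitions (the query is illustrative):

```csharp
var results = await memory.SearchAsync("How is blockchain changing industries?");

foreach (var citation in results.Results)
{
    Console.WriteLine($"{citation.SourceName} ({citation.SourceContentType})");

    foreach (var partition in citation.Partitions)
    {
        // Relevance is a value between 0 and 1 for the given query.
        var preview = partition.Text.Length > 80
            ? partition.Text[..80] + "..."
            : partition.Text;
        Console.WriteLine($"  [{partition.Relevance:F2}] {preview}");
    }
}
```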
Conclusion
With the advancement and improvement of LLMs and their adoption by different industries, the next step is to use these models beyond the knowledge bases on which they were trained; for companies, this becomes a necessity in order to get more utility from these models.
Initiatives such as Knowledge Bases for Amazon Bedrock or Microsoft’s own Kernel Memory open up new possibilities for developers and their companies to expand their use of LLMs.
Documents
The code shown in this article makes use of some documents published by Globant:
- Tech trends report 2024 by Globant
- Driving change forward in the Automotive Industry
- Blockchain (Sentinel report 2022)
- How AI is changing the narrative
I invite you to visit Globant’s library and take a look at our books, reports, and insights.