Querying data and documents with LLMs using Kernel Memory
Previously, in our article about Semantic Kernel, we talked about how Microsoft is doing a great job of providing developers with tools and SDKs to include AI in their applications.
Semantic Kernel allows us to orchestrate calls to third-party AI services easily, such as OpenAI, Hugging Face, or Azure AI, in addition to native functions.
And now, thanks to the Kernel Memory project, we can include text documents, spreadsheets, presentations, or web pages that an LLM can exploit.
Let’s dive into the Kernel Memory project and see how to add this new framework to your project to include your organization’s documents in your LLM system.
What is Kernel Memory?
Reading the project’s GitHub repository, we find this description:
Kernel Memory is a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for Retrieval Augmented Generation (RAG), synthetic memory, prompt engineering, and custom semantic memory processing.
In other words, Kernel Memory allows developers to integrate custom documents in different formats. It allows users to search or ask for information inside those documents, including citations to the documents from which the information was extracted.
Once Kernel Memory extracts the information from the documents the developer indicates, there are various ways of exposing it:
- Serverless mode (integrated into a .NET application)
- As a web service
- As a Semantic Kernel plugin
But before continuing, let’s review the formats of the documents that, for the moment, we can use with Kernel Memory.
- Office documents: Word, Excel, and PowerPoint
- Web pages
- Images in JPEG, PNG, and TIFF formats, with text extracted via OCR
- Markdown
- JSON documents
- Plain text files
As a starting point, that is not bad: it covers the Microsoft Office file formats and the other usual formats in which companies store their information.
IKernelMemory: The core
Both the Service mode and the Serverless mode implement the same `IKernelMemory` interface, so we can perform the same operations regardless of how we configure and start Kernel Memory.
The operations defined in this interface can be grouped according to their purpose. The first group contains the operations we use to import documents, web pages, and other formats into our service:

- `ImportDocumentAsync(Document document)`: adds a previously created document to the current Kernel Memory instance.
- `ImportDocumentAsync(string filePath)`: adds the file located at the given path.
- `ImportDocumentAsync(Stream content)`: opens a stream and adds its content to Kernel Memory.
- `ImportTextAsync(string text)`: adds plain text.
- `ImportWebPageAsync(string url)`: adds the content of a web page to Kernel Memory.
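As a rough sketch, a few of these import operations could be used like this (the file path, text, and URL are placeholders, and the `memory` instance is built with `KernelMemoryBuilder`, which we cover in the "Creating the service" section below):

```csharp
using Microsoft.KernelMemory;

// A serverless Kernel Memory instance; its configuration is covered
// in the "Creating the service" section below.
var memory = new KernelMemoryBuilder()
    .WithOpenAIDefaults(Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
    .Build<MemoryServerless>();

// The file path, text, and URL below are placeholders.
await memory.ImportDocumentAsync("reports/tech-trends-2024.pdf");
await memory.ImportTextAsync("Kernel Memory supports RAG over custom documents.");
await memory.ImportWebPageAsync("https://example.com/insights");
```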
The second group contains the operations we use to query the documents imported with the functions of the previous group:

- `AskAsync(string question)`: returns a `MemoryAnswer`. We obtain a response to our question in a chat style, like a ChatGPT response, together with a list of sources (documents) used to build the answer.
- `SearchAsync(string query)`: returns a `SearchResult`. This method gives us a list of documents where users can find the answer to their questions.
The difference between the two operations is that `SearchAsync` searches the given index for a list of documents relevant to the given query, while `AskAsync` searches the given index for an answer to the given query.
Other operations that developers could perform against documents are:

- Delete an existing document using `DeleteDocumentAsync`, passing the document identifier as a parameter.
- Check whether a document is ready for usage with `IsDocumentReadyAsync`, passing the document identifier as a parameter.
As we can see, `IKernelMemory` provides us with the necessary functions to integrate documents and operate on them.
Creating the service
Now that we have a general idea about Kernel Memory, let's go deeper into some of its concepts.
The first thing we have to do is initialize the Kernel Memory service. In our case, we are going to do it in two different ways: serverless or as a web service. If we choose the serverless option, we can use the Kernel Memory extensions to integrate third-party services, like the following:
- OpenAI
- AzureAI
- LlamaSharp, a C# binding for models based on LLaMA
In our case, we will use OpenAI through the `Microsoft.KernelMemory.AI.OpenAI` package.
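A minimal sketch of both options might look like the following (assuming the `KernelMemoryBuilder` and `MemoryWebClient` types from the official packages; the API key and service URL are placeholders):

```csharp
using Microsoft.KernelMemory;

// Serverless: Kernel Memory runs embedded in our .NET application.
static IKernelMemory CreateServerlessMemory(string openAIApiKey) =>
    new KernelMemoryBuilder()
        .WithOpenAIDefaults(openAIApiKey) // needs a valid OpenAI API key
        .Build<MemoryServerless>();

// Web service: we connect to a Kernel Memory service running elsewhere,
// e.g. at a placeholder URL like "http://127.0.0.1:9001".
static IKernelMemory CreateWebServiceClient(string serviceUrl) =>
    new MemoryWebClient(serviceUrl);
```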
In the code above, two methods are used to create the serverless and web service versions of Kernel Memory.
In the serverless case, we add the OpenAI integration using the `WithOpenAIDefaults` method, which needs a valid OpenAI API key.
And what about the service mode? It has been designed to run behind your backend, so you should not expose the Kernel Memory service to public traffic without authenticating your users first.
To store our OpenAI credentials, we recommend using user secrets at the development stage, as in the code example below.
For production environments, we recommend storing credentials using environment variables.
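A sketch of that configuration could look like the code below, assuming the project has a `UserSecretsId` set up via `dotnet user-secrets init` and the key is stored under a hypothetical `OpenAI:ApiKey` entry:

```csharp
using Microsoft.Extensions.Configuration;

// Development: dotnet user-secrets set "OpenAI:ApiKey" "<your-key>"
// Production:  set the OpenAI__ApiKey environment variable instead;
// later providers override earlier ones, so environment variables win.
var configuration = new ConfigurationBuilder()
    .AddUserSecrets<Program>()
    .AddEnvironmentVariables()
    .Build();

var openAIApiKey = configuration["OpenAI:ApiKey"]
    ?? throw new InvalidOperationException("OpenAI API key not configured.");
```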
Importing documents
Now that our Kernel Memory service is up and running, we should add the most essential ingredient to the recipe: our documents.
To add those Office suite files, web pages, or Markdown documents, we must tag the documents.
But let me clarify one thing before continuing. A Kernel Memory Document is not a 1-to-1 relation with a physical file or resource; it is a 1-to-N relationship. This means that a document could be created from three PDF files or only one.
A tag is a key-value pair that helps the AI service behind Kernel Memory interact with our documents. Those tags can be set using a `TagCollection` object or the `AddTag` method available in the `Document` class.
In our first example, we will use the `TagCollection` object, and you will notice that it is a collection of key-value entries.
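A brief sketch (the tag keys, values, and file name are illustrative):

```csharp
using Microsoft.KernelMemory;

// Each entry is a key-value pair; a key may hold several values.
var tags = new TagCollection
{
    { "type", "report" },
    { "year", "2024" }
};

var document = new Document(tags: tags)
    .AddFile("tech-trends-2024.pdf");
```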
In the following example, we will use the `AddTag` method, which accepts two parameters: the key name and a value for the key.
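Continuing the previous sketch, a tag could be added to a document like this:

```csharp
var document = new Document().AddFile("blockchain-sentinel-2022.pdf");

// First parameter is the key name, second is the value for that key.
document.AddTag("topic", "blockchain");
```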
As for adding files to the `Document`, you can use the following methods to add files or resources, as shown in the sketch after this list:

- `AddFile`: adds the file located at the path the method receives as a `string` parameter.
- `AddFiles`: adds a collection of files located at the paths the method receives as an `IEnumerable<string>` collection or a `string[]` array.
- `AddStream`: reads the content of the `Stream` passed as a parameter, using the file name that is also passed as a parameter.
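Putting it all together, a single `Document` could aggregate several resources before being imported (a sketch; all paths are placeholders, and `memory` is the Kernel Memory instance created earlier):

```csharp
var document = new Document(tags: new TagCollection { { "type", "library" } })
    .AddFiles(new[] { "reports/automotive.pdf", "reports/ai-narrative.pdf" })
    .AddStream("notes.txt", File.OpenRead("local/notes.txt"));

await memory.ImportDocumentAsync(document);
```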
Asking
At this time, we know how to create a Kernel Memory service, add some documents, and tag them, but we still need to know how to operate on our documents.
The code below shows how to extract the information included in the response to our question.
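This is a sketch; the question is illustrative, and `memory` is the instance created earlier:

```csharp
var answer = await memory.AskAsync("What are the main tech trends for 2024?");

// The chat-style answer, similar to a ChatGPT response.
Console.WriteLine(answer.Result);

// The sources (documents) used to build the answer.
foreach (var source in answer.RelevantSources)
{
    Console.WriteLine($"- {source.SourceName} ({source.SourceContentType})");
}
```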
Depending on the documents' count and size, the asking operation could take some time. Once the operation finishes, we will obtain a result like this:
It looks like a result was obtained after making a question to an LLM model, such as ChatGPT.
Search
The Search operation gives us a list of documents in which we can look for the information that answers the question posed in our prompt.
In this case, the operation will return an object of type `SearchResult`, which has properties such as `SourceName` or `SourceContentType` that indicate the document’s name and type.
But we can also delve deeper into the documents and make Kernel Memory tell us exactly which part of those documents is necessary to answer our question.
To do this, we will use the `Partitions` property, a collection of objects of the `Partition` class, a nested type of the `Citation` class.
Each of these `Partition` objects has properties that will help us know which part of the document it belongs to is relevant for our search, such as:

- `Text`: the content of the document partition.
- `Relevance`: a value between 0 and 1 that shows the partition’s relevance to the given query.
- `Tags`: an object of `TagCollection` type with the tags associated with the document.
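A sketch that walks through those partitions (the query is illustrative):

```csharp
var results = await memory.SearchAsync("How is blockchain changing industries?");

foreach (var citation in results.Results)
{
    Console.WriteLine($"{citation.SourceName} ({citation.SourceContentType})");

    foreach (var partition in citation.Partitions)
    {
        // Relevance is a value between 0 and 1 for the given query.
        var preview = partition.Text.Length > 80
            ? partition.Text[..80] + "..."
            : partition.Text;
        Console.WriteLine($"  [{partition.Relevance:F2}] {preview}");
    }
}
```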
Conclusion
With the advancement and improvement of LLMs and their adoption by different industries, the next step is to use these models beyond the knowledge bases on which they were trained; for companies, this becomes a necessity in order to get more utility from these models.
Initiatives such as Knowledge Bases for Amazon Bedrock or Microsoft’s own Kernel Memory open up new possibilities for developers and their companies to expand their use of LLMs.
Documents
The code shown in this article makes use of some documents published by Globant:
- Tech trends report 2024 by Globant
- Driving change forward in the Automotive Industry
- Blockchain (Sentinel report 2022)
- How AI is changing the narrative
I invite you to visit Globant’s library and take a look at our books, reports, and insights.