A completely local RAG: .NET LangChain, SQLite, and Ollama with no API keys required.

John Kane
3 min readMay 9, 2024



Note: Generative Artificial Intelligence tools were used to generate images and for editorial purposes.

This is the second post in a series where I share my experiences implementing local AI solutions which do not require subscriptions or API keys.

In this post, I’ll demonstrate an example using a .NET version of LangChain. We will use Ollama for inference with the Llama 3 model, and a local SQLite database as the vector store for embeddings and retrieval-augmented generation. This example uses the C# LangChain library, which can be found here:

https://github.com/tryAGI/LangChain

The completed code of this example can be found here:

https://github.com/john-c-kane/lc-ollama-dotnet

Step 1. Install Ollama

https://ollama.com/

Step 2. Download the llama3 and all-minilm model weights

To download the weights, simply open a command prompt and type “ollama pull …”. You can confirm the downloads completed with “ollama list”.

C:\>ollama pull llama3
C:\>ollama pull all-minilm

Run the following notebook in Visual Studio Code.

This code requires the following extensions:

Polyglot Notebooks

C#

One note on Ollama embeddings: there is an open request to add Dimensions as a parameter, but that is still under development. For now, setting 384 for all-minilm has no effect; however, with higher-dimensional models such as nomic-embed-text and mxbai-embed-large, you might get unexpected results. In my testing, all-minilm provided the best default similarity-search behavior.
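Before wiring up the full pipeline, it can be worth confirming what width a given embedding model actually returns. A minimal sketch follows; the method and property names (CreateEmbeddingsAsync, Values) reflect my understanding of the tryAGI/LangChain Ollama provider, so verify them against the library version you install:

using LangChain.Providers.Ollama;

var provider = new OllamaProvider();
var embeddingModel = new OllamaEmbeddingModel(provider, id: "all-minilm");

// Embed a short string and inspect the vector length.
// all-minilm is a 384-dimensional model, so this should print 384.
var response = await embeddingModel.CreateEmbeddingsAsync("hello world");
Console.WriteLine(response.Values[0].Length);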

#r "nuget: LangChain, *-*"
#r "nuget: LangChain.Core, *-*"
#r "nuget: LangChain.DocumentLoaders.Pdf, *-*"
#r "nuget: LangChain.Databases.Sqlite, *-*"
#r "nuget: LangChain.Providers.Ollama, *-*"
#r "nuget: LangChain.Splitters.CSharp, *-*"
#r "nuget: Ollama, *-*"

using LangChain.Databases.Sqlite;
using LangChain.DocumentLoaders;
using LangChain.Providers.Ollama;
using LangChain.Extensions;
using Ollama;

var provider = new OllamaProvider(options: new RequestOptions
{
    Stop = ["\n"],
    Temperature = 0.0f,
});
var embeddingModel = new OllamaEmbeddingModel(provider, id: "all-minilm");
//var embeddingModel = new OllamaEmbeddingModel(provider, id: "nomic-embed-text");
//var embeddingModel = new OllamaEmbeddingModel(provider, id: "mxbai-embed-large");
var llm = new OllamaChatModel(provider, id: "llama3");

var vectorDatabase = new SqLiteVectorDatabase(dataSource: "vectors.db");

var vectorCollection = await vectorDatabase.AddDocumentsFromAsync<PdfPigPdfLoader>(
    embeddingModel, // Used to convert text to embeddings
    dimensions: 384, // all-minilm produces 384-dimensional embeddings
    dataSource: DataSource.FromUrl("https://canonburyprimaryschool.co.uk/wp-content/uploads/2016/01/Joanne-K.-Rowling-Harry-Potter-Book-1-Harry-Potter-and-the-Philosophers-Stone-EnglishOnlineClub.com_.pdf"),
    collectionName: "harrypotter", // Can be omitted; use it if you want multiple collections
    textSplitter: null, // Use the default text splitter
    behavior: AddDocumentsToDatabaseBehavior.JustReturnCollectionIfCollectionIsAlreadyExists);

const string question = "What is Harry's Address?";
var similarDocuments = await vectorCollection.GetSimilarDocuments(embeddingModel, question, amount: 5);
// Use similar documents and LLM to answer the question
var answer = await llm.GenerateAsync(
    $"""
    Use the following pieces of context to answer the question at the end.
    If the answer is not in context then just say that you don't know, don't try to make up an answer.
    Keep the answer as short as possible.

    {similarDocuments.AsString()}

    Question: {question}
    Helpful Answer:
    """).ConfigureAwait(false);

Console.WriteLine($"LLM answer: {answer}");

// Optionally write out the similar documents returned by the vector database
Console.WriteLine("Similar Documents:");
foreach (var document in similarDocuments)
{
    Console.WriteLine(document);
}

LLM answer: The address written on the letter is: Mr H. Potter The Cupboard under the Stairs 4 Privet Drive Little Whinging Surrey
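Because the embeddings are persisted in vectors.db and the behavior flag above skips re-embedding when the collection already exists, later runs can reuse the stored collection instead of re-ingesting the PDF. A hedged sketch, assuming the database exposes a GetCollectionAsync method (check the tryAGI/LangChain API for the exact name in your version):

using LangChain.Databases.Sqlite;
using LangChain.Extensions;
using LangChain.Providers.Ollama;

var provider = new OllamaProvider();
var embeddingModel = new OllamaEmbeddingModel(provider, id: "all-minilm");

// Re-open the persisted database and fetch the existing collection by name.
var vectorDatabase = new SqLiteVectorDatabase(dataSource: "vectors.db");
var collection = await vectorDatabase.GetCollectionAsync("harrypotter");

// Query the stored vectors directly; no PDF download or embedding pass needed.
var docs = await collection.GetSimilarDocuments(embeddingModel, "Who delivers Harry's letters?", amount: 5);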
