Knowledge Pills

Short stories filled with laughter and programming concepts.

Familiarizing .NET Developers with Ollama: Leveraging Local AI Models for System Integration

--

What is Ollama?

Ollama is a local AI solution that enables you to run large language models (LLMs) such as Llama, Mistral, Gemma, or Qwen on your own infrastructure, typically in a containerized environment such as Docker. It's designed to provide developers with an easy way to integrate AI capabilities into their applications without relying on cloud-based models or APIs. By running Ollama locally, you have full control over the model, data, and interactions.

In simple terms, Ollama is like a local AI server that hosts models, such as Llama3.1, and allows you to interact with them programmatically through HTTP-based API calls.

Ollama to a .NET Developer

As a .NET developer, you can think of Ollama as a self-hosted machine learning service similar to how you might host a REST API or a web service in your own infrastructure. Here’s how you can understand it:

1. Ollama is a Dockerized AI Service

Ollama runs inside a Docker container, making it portable and easy to deploy. You can deploy it locally (on your own machine) or on a cloud server. This is like running your .NET application inside Docker for easier deployment and management. See screenshot below for an example of Ollama running inside Docker Desktop.

Example of Ollama running in Docker Desktop

2. LLMs (Large Language Models)

  • Ollama supports LLMs, for example Llama3.1, which is a type of AI model trained to understand and generate human language. Llama3.1 is similar to models like OpenAI’s GPT-3, but you run it locally.
  • Think of LLMs as highly advanced APIs that you can call to process natural language — whether it’s generating text, answering questions, or even helping with code.

See screenshot below for a list of LLM models loaded in Ollama.

List of LLM models available from Ollama
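
You can also retrieve this list programmatically: Ollama exposes an /api/tags endpoint that returns the models installed locally. Below is a minimal sketch, assuming the default endpoint on port 11434; adjust it for your setup.

using System.Net.Http;

// Minimal sketch: list the models installed in a local Ollama instance.
// Assumes Ollama is running on the default endpoint http://localhost:11434.
using var http = new HttpClient();
string models = await http.GetStringAsync("http://localhost:11434/api/tags");
Console.WriteLine(models); // JSON describing the locally available models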

3. How Ollama Fits Into a .NET Application

  • Just like how you might make API calls to external services in a .NET application (e.g., calling an external REST API or a service), Ollama allows you to make HTTP-based API calls to interact with the LLM models running locally.
  • Instead of relying on a cloud-based AI service (such as OpenAI or Azure Cognitive Services), you have full control over the model and the interactions. This can be crucial for privacy, data security, and ensuring that the model can be customized to meet specific needs.

4. Ollama as a Local API Endpoint

Once you have Ollama running in a container, it exposes an HTTP endpoint (typically http://localhost:11434/) that you can call from your .NET application. You send text-based prompts (such as a question or request) to this endpoint, and the model processes your request and returns a response. It’s just like making HTTP calls to a REST API endpoint.

For example:

  • Chat Endpoint: You send a user’s query (like “Tell me about the Pickleball rules”) to the Ollama model, and it responds with a relevant answer (see the sketch after this list).
  • Conversation history: You can keep track of past interactions by including previous messages in each request, so the model has the full conversation context.
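
To make this concrete, here is a minimal sketch of calling the chat endpoint directly with HttpClient. It assumes the default endpoint on port 11434 and that the llama3.1 model has already been pulled; adjust both for your environment.

using System.Net.Http.Json;
using System.Text.Json;

// Minimal sketch: send a single chat prompt to a local Ollama instance.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434/") };

var request = new
{
    model = "llama3.1",
    messages = new[]
    {
        new { role = "user", content = "Tell me about the Pickleball rules" }
    },
    stream = false // return one JSON document instead of a token stream
};

using var response = await http.PostAsJsonAsync("api/chat", request);
response.EnsureSuccessStatusCode();

// The assistant's reply is in message.content of the response body.
using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
Console.WriteLine(json.RootElement.GetProperty("message").GetProperty("content").GetString());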

See screenshot below for Ollama running on localhost and available via port 11434 (the default).

Ollama running on localhost

5. Using Microsoft.Extensions.AI with Ollama

In .NET, you would typically use libraries like Microsoft.Extensions.AI to integrate AI services into your application. With Ollama, you configure this library to communicate with the locally-running model. It’s similar to how you might set up HTTP clients or dependency injection in .NET to talk to external APIs.

For example, you can configure an HTTP client that talks to the local Ollama instance, passes chat prompts, and receives responses. Here’s where you can connect it seamlessly to your backend application, whether it’s a web API or a microservice.

Below is example C# code to connect to Ollama:

using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

namespace OllamaLocalAI.Extensions
{
    public static class Extensions
    {
        public static void AddApplicationServices(this IHostApplicationBuilder builder)
        {
            builder.AddAIServices();
        }

        private static void AddAIServices(this IHostApplicationBuilder builder)
        {
            // Resolve a logger factory for the chat pipeline (the sample builds a temporary provider here).
            var loggerFactory = builder.Services.BuildServiceProvider().GetService<ILoggerFactory>();

            // Expects settings such as "AI:Ollama:Endpoint": "http://localhost:11434"
            // and "AI:Ollama:ChatModel": "llama3.1" (for example in appsettings.json).
            string? ollamaEndpoint = builder.Configuration["AI:Ollama:Endpoint"];
            if (!string.IsNullOrWhiteSpace(ollamaEndpoint))
            {
                // OllamaChatClient (from the Microsoft.Extensions.AI.Ollama package) talks to the
                // local Ollama endpoint; AddChatClient registers the resulting pipeline as the
                // application's IChatClient.
                builder.Services.AddChatClient(
                        new OllamaChatClient(ollamaEndpoint, builder.Configuration["AI:Ollama:ChatModel"] ?? "llama3.1"))
                    .UseFunctionInvocation()                                        // enable tool/function calling
                    .UseOpenTelemetry(configure: t => t.EnableSensitiveData = true) // trace prompts and responses
                    .UseLogging(loggerFactory);
            }
        }
    }
}
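
Once the chat client is registered, consuming it is ordinary dependency injection. Below is a minimal sketch of a Program.cs that exposes a hypothetical /chat endpoint; the route, parameter name, and the CompleteAsync call follow the preview Microsoft.Extensions.AI API used above (newer releases rename it to GetResponseAsync), so treat this as illustrative rather than part of the sample project.

using Microsoft.Extensions.AI;
using OllamaLocalAI.Extensions;

var builder = WebApplication.CreateBuilder(args);

// Registers the Ollama-backed IChatClient shown above.
builder.AddApplicationServices();

var app = builder.Build();

// Hypothetical endpoint: forwards a prompt to the local model and returns the reply.
app.MapGet("/chat", async (IChatClient chatClient, string prompt) =>
{
    var completion = await chatClient.CompleteAsync(prompt);
    return Results.Ok(completion.Message.Text);
});

app.Run();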

6. Benefits for .NET Developers

  • Customization: You can customize the model or integrate it into your applications as needed.
  • Cost: Running the model locally can save costs associated with cloud-based AI APIs.
  • Data Privacy: You control the data being processed by the model, which can be important for certain use cases (e.g., HIPAA-compliant applications).
  • Performance: Running AI models locally removes the network round trip to a cloud service; with suitable hardware, interactions with the model can be faster.

Example Scenario

Imagine you’re developing a .NET-based chatbot for your company’s internal application. Instead of relying on a third-party API like OpenAI’s GPT, you could run Ollama locally to power the chatbot. By doing so, you gain several advantages:

  • Data Privacy: No need to share your data with external providers.
  • Cost Savings: Lower operational costs, especially with high usage.
  • Customization: Fine-tune the model for specific needs related to your business.

In this scenario, you would use .NET code to send user queries (as HTTP requests) to Ollama, and then process and display the responses in your application’s UI.
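
As a rough sketch of that flow, the snippet below keeps the conversation history in a list and resends it with every request so the model can answer follow-up questions in context. The client is constructed directly here for brevity (in the real application it would come from dependency injection), the prompts are purely illustrative, and the method names again follow the preview Microsoft.Extensions.AI API.

using Microsoft.Extensions.AI;

// Hypothetical multi-turn exchange for an internal chatbot.
// Endpoint and model name assume a default local Ollama setup.
IChatClient chatClient = new OllamaChatClient("http://localhost:11434/", "llama3.1");

var history = new List<ChatMessage>
{
    new(ChatRole.System, "You are a helpful assistant for internal company questions.")
};

// First user turn.
history.Add(new ChatMessage(ChatRole.User, "Summarize our travel expense policy."));
var reply = await chatClient.CompleteAsync(history);
history.Add(reply.Message); // keep the answer so follow-ups have context
Console.WriteLine(reply.Message.Text);

// Follow-up turn reuses the accumulated history.
history.Add(new ChatMessage(ChatRole.User, "Is there a daily meal limit?"));
var followUp = await chatClient.CompleteAsync(history);
Console.WriteLine(followUp.Message.Text);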

If you’re interested in seeing how this works in action, check out the sample project! You can download the complete source code from my GitHub repository:

Download Complete Source Code

Example screenshot of the Web API interacting with the Llama 3.1 model in Ollama

Summary

Ollama is a local AI solution that enables .NET developers to run large language models (LLMs), for example Llama 3.1, in a self-hosted environment using Docker. It offers a flexible, private alternative to cloud-based AI services by allowing developers to deploy and interact with AI models on their own infrastructure.

For .NET developers, Ollama can be understood as a local API endpoint that processes natural language queries. By integrating Ollama into a .NET application, developers can leverage AI-powered capabilities such as chatbots, data processing, and more, without relying on third-party cloud providers. Ollama can be configured using libraries like Microsoft.Extensions.AI, enabling seamless communication with the model via HTTP-based API calls.

Key benefits of using Ollama include cost savings, data privacy, and the ability to customize the model for specific business requirements. With Ollama running locally, .NET developers can efficiently integrate advanced AI features into their applications while maintaining full control over the model and data.
