Familiarizing .NET Developers with Ollama: Leveraging Local AI Models for System Integration
What is Ollama?
Ollama is a local AI solution that enables you to run large language models (LLMs) such as Llama, Mistral, Gemma, and Qwen on your own infrastructure, typically in a containerized environment such as Docker. It's designed to give developers an easy way to integrate AI capabilities into their applications without relying on cloud-based models or APIs. By running Ollama locally, you have full control over the model, the data, and the interactions.
In simple terms, Ollama is like a local AI server that hosts models, such as Llama 3.1, and allows you to interact with them programmatically through HTTP-based API calls.
Ollama for the .NET Developer
As a .NET developer, you can think of Ollama as a self-hosted machine learning service similar to how you might host a REST API or a web service in your own infrastructure. Here’s how you can understand it:
1. Ollama is a Dockerized AI Service
Ollama runs inside a Docker container, making it portable and easy to deploy. You can run it locally (on your own machine) or on a cloud server, much like running your .NET application inside Docker for easier deployment and management. See screenshot below for an example of Ollama running inside Docker Desktop.
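If you don't already have Ollama running, a typical way to start it with Docker looks like the commands below (based on the standard ollama/ollama image usage; adjust the port mapping and volume name to your environment):

# Run the Ollama container and expose the default API port 11434
docker run -d --name ollama -p 11434:11434 -v ollama:/root/.ollama ollama/ollama

# Pull a model into the running container, e.g. Llama 3.1
docker exec -it ollama ollama pull llama3.1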
2. LLMs (Large Language Models)
- Ollama supports LLMs, for example Llama 3.1, a type of AI model trained to understand and generate human language. Llama 3.1 is similar to OpenAI's GPT models, but you run it locally.
- Think of LLMs as highly advanced APIs that you can call to process natural language, whether it's generating text, answering questions, or even helping with code.
See screenshot below for a list of LLM models loaded in Ollama.
3. How Ollama Fits Into a .NET Application
- Just as you might call external services from a .NET application (e.g., a REST API or another web service), Ollama allows you to make HTTP-based API calls to interact with the LLM models running locally.
- Instead of relying on a cloud-based AI service (such as OpenAI or Azure Cognitive Services), you have full control over the model and the interactions. This can be crucial for privacy, data security, and ensuring that the model can be customized to meet specific needs.
4. Ollama as a Local API Endpoint
Once you have Ollama running in a container, it exposes an HTTP endpoint (typically http://localhost:11434/) that you can call from your .NET application. You send text-based prompts (such as a question or request) to this endpoint, and the model processes your request and returns a response. It's just like making HTTP calls to a REST API endpoint.
For example:
- Chat endpoint: You send a user's query (like “Tell me about the Pickleball rules”) to the Ollama model, and it responds with a relevant answer (a minimal sketch of this call is shown below).
- Conversation history: The chat endpoint also accepts the earlier messages of a conversation alongside the new one, so your application can keep track of conversation history by sending it with each request.
See screenshot below for Ollama running on localhost and available via port 11434 (the default).
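To make this concrete, here is a minimal sketch of calling the chat endpoint with a plain HttpClient. The request and response shapes follow Ollama's /api/chat API; the model name llama3.1 is only an example and must already be pulled locally:

using System.Net.Http.Json;
using System.Text.Json;

using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434/") };

// Build a chat request for the locally hosted model.
var request = new
{
    model = "llama3.1",
    stream = false,
    messages = new[]
    {
        new { role = "user", content = "Tell me about the Pickleball rules" }
    }
};

// POST to Ollama's chat endpoint and read the assistant's reply.
var response = await http.PostAsJsonAsync("api/chat", request);
response.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
string answer = doc.RootElement.GetProperty("message").GetProperty("content").GetString()!;
Console.WriteLine(answer);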
5. Using Microsoft.Extensions.AI with Ollama
In .NET, you would typically use libraries like Microsoft.Extensions.AI to integrate AI services into your application. With Ollama, you configure this library to communicate with the locally running model. It's similar to how you might set up HTTP clients or dependency injection in .NET to talk to external APIs.
For example, you can configure an HTTP client that talks to the local Ollama instance, passes chat prompts, and receives responses. Here’s where you can connect it seamlessly to your backend application, whether it’s a web API or a microservice.
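The sample that follows reads its endpoint and model name from configuration. Assuming you keep those settings in appsettings.json, the section might look like this (the keys match the ones used in the code below; the values are examples):

{
  "AI": {
    "Ollama": {
      "Endpoint": "http://localhost:11434/",
      "ChatModel": "llama3.1"
    }
  }
}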
Below is example C# code to connect to Ollama:
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

namespace OllamaLocalAI.Extensions
{
    public static class Extensions
    {
        public static void AddApplicationServices(this IHostApplicationBuilder builder)
        {
            builder.AddAIServices();
        }

        private static void AddAIServices(this IHostApplicationBuilder builder)
        {
            // Resolve a logger factory for the chat pipeline's logging step.
            var loggerFactory = builder.Services.BuildServiceProvider().GetService<ILoggerFactory>();

            // Read the Ollama endpoint (e.g. http://localhost:11434/) from configuration.
            string? ollamaEndpoint = builder.Configuration["AI:Ollama:Endpoint"];
            if (!string.IsNullOrWhiteSpace(ollamaEndpoint))
            {
                // Register an IChatClient that talks to the local Ollama instance,
                // with function invocation, OpenTelemetry, and logging in the pipeline.
                builder.Services.AddChatClient(
                        new OllamaChatClient(ollamaEndpoint, builder.Configuration["AI:Ollama:ChatModel"] ?? "llama3.1"))
                    .UseFunctionInvocation()
                    .UseOpenTelemetry(configure: t => t.EnableSensitiveData = true)
                    .UseLogging(loggerFactory)
                    .Build();
            }
        }
    }
}
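Once the chat client is registered, consuming it from application code is plain constructor injection. The sketch below is hypothetical and assumes a recent Microsoft.Extensions.AI release where IChatClient exposes GetResponseAsync; the ChatService class name is illustrative:

using System.Threading.Tasks;
using Microsoft.Extensions.AI;

public class ChatService
{
    private readonly IChatClient _chatClient;

    // IChatClient is resolved from DI and already wired to the local Ollama instance.
    public ChatService(IChatClient chatClient) => _chatClient = chatClient;

    public async Task<string> AskAsync(string userQuestion)
    {
        // Send the user's question to the locally hosted model and return its reply.
        var response = await _chatClient.GetResponseAsync(
            new[] { new ChatMessage(ChatRole.User, userQuestion) });

        return response.Text;
    }
}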
6. Benefits for .NET Developers
- Customization: You can customize the model or integrate it into your applications as needed.
- Cost: Running the model locally can save costs associated with cloud-based AI APIs.
- Data Privacy: You control the data being processed by the model, which can be important for certain use cases (e.g., HIPAA-compliant applications).
- Performance: Running the model locally removes the network round trip to a cloud service, which can reduce latency per interaction (actual response time still depends on your local hardware).
Example Scenario
Imagine you’re developing a .NET-based chatbot for your company’s internal application. Instead of relying on a third-party API like OpenAI’s GPT, you could run Ollama locally to power the chatbot. By doing so, you gain several advantages:
- Data Privacy: No need to share your data with external providers.
- Cost Savings: Lower operational costs, especially with high usage.
- Customization: Fine-tune the model for specific needs related to your business.
In this scenario, you would use .NET code to send user queries (as HTTP requests) to Ollama, and then process and display the responses in your application’s UI.
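For a chat UI, you typically want to show the answer as it is generated and keep the running conversation. The sketch below is a hypothetical variation of the same idea, assuming a recent Microsoft.Extensions.AI release where IChatClient exposes GetStreamingResponseAsync; replace the Console.Write call with your own UI update logic:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

public class ChatbotConversation
{
    private readonly IChatClient _chatClient;
    private readonly List<ChatMessage> _history = new();

    public ChatbotConversation(IChatClient chatClient) => _chatClient = chatClient;

    public async Task ReplyAsync(string userInput)
    {
        // Keep the full conversation so the model sees earlier turns.
        _history.Add(new ChatMessage(ChatRole.User, userInput));

        var assistantText = new StringBuilder();

        // Stream the response so the UI can render it incrementally.
        await foreach (var update in _chatClient.GetStreamingResponseAsync(_history))
        {
            Console.Write(update.Text);       // UI update goes here
            assistantText.Append(update.Text);
        }

        _history.Add(new ChatMessage(ChatRole.Assistant, assistantText.ToString()));
    }
}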
If you’re interested in seeing how this works in action, check out the sample project! You can download the complete source code from my GitHub repository:
Summary
Ollama is a local AI solution that enables .NET developers to run large language models (LLMs), for example Llama 3.1, in a self-hosted environment using Docker. It offers a flexible, private alternative to cloud-based AI services by allowing developers to deploy and interact with AI models on their own infrastructure.
For .NET developers, Ollama can be understood as a local API endpoint that processes natural language queries. By integrating Ollama into a .NET application, developers can leverage AI-powered capabilities such as chatbots, data processing, and more, without relying on third-party cloud providers. Ollama can be configured using libraries like Microsoft.Extensions.AI, enabling seamless communication with the model via HTTP-based API calls.
Key benefits of using Ollama include cost savings, data privacy, and the ability to customize the model for specific business requirements. With Ollama running locally, .NET developers can efficiently integrate advanced AI features into their applications while maintaining full control over the model and data.