
Creating a Web API with .NET 9 to Interact with a Local Ollama AI Instance using Llama 3.1


Introduction

In this tutorial, you’ll learn how to create a web API using .NET 9 that works with Ollama. Ollama is a free, open-source tool that lets you run large language models (LLMs) on your own computer. LLMs are sophisticated AI programs that can generate human-like text and code, as well as perform analytical tasks. By running these models locally with Ollama, you eliminate the need for cloud services, making AI technology more accessible for everyday use. It’s especially appealing to developers, researchers, and businesses who want to keep their data secure and private.

You can download the complete tutorial source code from GitHub.

Prerequisites

  • .NET 9 SDK or later
  • Docker running on your local machine
  • Ollama installed and running inside Docker (listening on port 11434)
  • Visual Studio Code or any IDE of your choice
  • Scalar (for testing the API)

If you don’t have Ollama running in Docker yet, follow the setup instructions on the Ollama GitHub page. Be sure to run ollama run llama3.1 at the command prompt inside the container so that this model is pulled into Ollama (see the example commands below).

Screenshot of terminal in Ollama container and command to install models
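
If you’re starting from scratch, the typical Docker setup looks like this (the container name ollama is just a convention; adjust as needed):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama3.1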

Part 1: Project Setup

We begin by creating a new ASP.NET Core Web API project. Open your terminal and run the following command:

dotnet new webapi -n OllamaLocalAI
cd OllamaLocalAI

Part 2: Adding Required NuGet Packages

Next, we need to add a handful of NuGet packages to the project. The key one is Microsoft.Extensions.AI, which provides the abstractions we use to interact with AI models.

Microsoft.Extensions.AI is a set of core .NET libraries created in collaboration with developers across the .NET ecosystem, including Semantic Kernel. These libraries provide a unified layer of C# abstractions for interacting with AI services, such as small and large language models (SLMs and LLMs), embeddings, and middleware.
https://learn.microsoft.com/en-us/dotnet/ai/ai-extensions
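
To get a feel for the abstraction before any DI wiring, here is a minimal console-style sketch that talks to Ollama directly through IChatClient (the endpoint and model match the ones used throughout this tutorial; the prompt is just an example):

using Microsoft.Extensions.AI;

// OllamaChatClient implements IChatClient against the local Ollama REST endpoint.
IChatClient client = new OllamaChatClient(new Uri("http://localhost:11434/"), "llama3.1");

// CompleteAsync sends the prompt to the model and returns the completion.
var response = await client.CompleteAsync("Explain a pickleball dink in one sentence.");
Console.WriteLine(response.Message.Text);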

Run the following commands to add the packages:

dotnet add package Microsoft.AspNetCore.OpenApi
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.Ollama
dotnet add package Microsoft.Extensions.DependencyInjection
dotnet add package Scalar.AspNetCore
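
Note: at the time of writing, the Microsoft.Extensions.AI packages were still in preview, so if NuGet can’t resolve them you may need to add the --prerelease flag, for example:

dotnet add package Microsoft.Extensions.AI --prerelease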

Part 3: Configuration

In the appsettings.json file, we configure the Ollama AI endpoint and settings for the chat. Here’s how it should look:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*",
  "AI": {
    "Ollama": {
      "Endpoint": "http://localhost:11434/",
      "ChatModel": "llama3.1"
    }
  },
  "ChatSettings": {
    "SystemMessage": "Hi there! I’m your AI assistant for PickleIQ...",
    "AssistantMessage": "Hi! I'm the PickleIQ Coach. How can I help?"
  }
}

This file contains:

  • AI:Ollama:Endpoint: The base URL of the local Ollama service.
  • AI:Ollama:ChatModel: The model the chat client should use.
  • ChatSettings: Default system and assistant messages used to ground each conversation.

Part 4: Configuring AI Services

In Extensions.cs, we create extension methods to configure the AI services. They set up the chat client using OllamaChatClient and connect it to the configured model.

using Microsoft.Extensions.AI;

namespace OllamaLocalAI.Extensions;

public static class Extensions
{
    public static void AddApplicationServices(this IHostApplicationBuilder builder)
    {
        builder.AddAIServices();
    }

    private static void AddAIServices(this IHostApplicationBuilder builder)
    {
        // Builds a temporary provider just to obtain a logger factory for the chat pipeline.
        var loggerFactory = builder.Services.BuildServiceProvider().GetService<ILoggerFactory>();

        string? ollamaEndpoint = builder.Configuration["AI:Ollama:Endpoint"];
        if (!string.IsNullOrWhiteSpace(ollamaEndpoint))
        {
            builder.Services.AddChatClient(new OllamaChatClient(ollamaEndpoint, builder.Configuration["AI:Ollama:ChatModel"] ?? "llama3.1"))
                .UseFunctionInvocation()                                        // enable tool/function calling
                .UseOpenTelemetry(configure: t => t.EnableSensitiveData = true) // trace calls, including prompt content
                .UseLogging(loggerFactory)                                      // log chat client activity
                .Build();
        }
    }
}

The code consists of two extension methods for IHostApplicationBuilder:

  1. AddApplicationServices: Called on the builder at startup; its only job is to register the AI-related services.
  2. AddAIServices: This private method reads the Ollama endpoint and model from configuration, sets up the chat client, and layers on middleware for function invocation, OpenTelemetry, and logging.

Part 5: Setting Up the Program

In Program.cs, we configure the services and set up the Web API pipeline:

using OllamaLocalAI.Extensions;
using Scalar.AspNetCore;

namespace OllamaLocalAI
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var builder = WebApplication.CreateBuilder(args);

            // Add services to the container.
            builder.Services.AddControllers();

            // Learn more about configuring OpenAPI at https://aka.ms/aspnet/openapi
            builder.Services.AddOpenApi();

            // Adding application services to the builder
            builder.AddApplicationServices();

            var app = builder.Build();

            // Configure the HTTP request pipeline.
            if (app.Environment.IsDevelopment())
            {
                app.MapOpenApi();
                app.MapScalarApiReference();
            }

            app.UseHttpsRedirection();

            app.UseAuthorization();

            app.MapControllers();

            app.Run();
        }
    }
}

The code is a simple ASP.NET Core application that sets up various services and middleware. Here’s a breakdown of each part:

  1. Namespaces: The necessary namespaces are imported.
  2. Program Class: Contains the application’s entry point.
  3. Main Method: Configures the web host builder, registers the necessary services, and builds and runs the application.

Code Walkthrough

In addition to the common code in Program.cs, this code sets up a .NET application with integrated AI services and Scalar for API reference testing.

AddApplicationServices(): A custom extension method to configure application-specific services.

MapScalarApiReference(): Maps endpoints for Scalar to provide API reference testing. Scalar is used to validate and test API endpoints, ensuring AI and system integrations function as expected.
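
With the defaults used here, MapScalarApiReference serves the interactive UI at /scalar/v1 (the OpenAPI document registered by AddOpenApi is named v1), so once the app runs in development you can browse there to try the endpoints. The exact route may vary between Scalar.AspNetCore versions.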

Part 6: Implementing the Chat Controller

Here’s the ChatController code:

using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.AI;

[ApiController]
[Route("api/[controller]")]
public class ChatController : ControllerBase
{
    private readonly IChatClient _chatClient;
    private readonly ILogger<ChatController> _logger;
    private readonly IConfiguration _configuration;

    public ChatController(IChatClient chatClient, ILogger<ChatController> logger, IConfiguration configuration)
    {
        _chatClient = chatClient;
        _logger = logger;
        _configuration = configuration;
    }

    [HttpPost]
    public async Task<IActionResult> Chat(ChatPrompt chatPrompt)
    {
        var messages = GroundPrompt(chatPrompt);
        try
        {
            var response = await _chatClient.CompleteAsync(messages);
            return Ok(response.Message);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error processing chat prompt");
            return StatusCode(500, "Internal server error");
        }
    }

    [HttpPost("chathistory")]
    public async Task<IActionResult> ChatHistory(ChatPrompt chatPrompt)
    {
        var messages = GroundPrompt(chatPrompt);
        try
        {
            var response = await _chatClient.CompleteAsync(messages);

            // Append the assistant's reply so the returned list reflects the full exchange.
            messages.Add(new ChatMessage(ChatRole.Assistant, response.Message.Contents));

            return Ok(messages.Select(m => new
            {
                Role = m.Role.ToString(),
                Message = m.Contents
            }));
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error processing chat prompt");
            return StatusCode(500, "Internal server error");
        }
    }

    // Grounds every request with the configured system and assistant messages
    // before appending the user's prompt.
    private List<ChatMessage> GroundPrompt(ChatPrompt chatPrompt)
    {
        var systemMessage = _configuration["ChatSettings:SystemMessage"];
        var assistantMessage = _configuration["ChatSettings:AssistantMessage"];
        return new List<ChatMessage>
        {
            new ChatMessage(ChatRole.System, systemMessage ?? "Default system message."),
            new ChatMessage(ChatRole.Assistant, assistantMessage ?? "Default assistant message."),
            new ChatMessage(ChatRole.User, chatPrompt.Message)
        };
    }
}

The code defines an ASP.NET Core controller, ChatController, that handles chat-related operations. It uses dependency injection for its services and exposes two endpoints: one that returns a single chat reply, and one that returns the full message list for the exchange.
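
One piece the listing doesn’t show is the ChatPrompt request model. A minimal definition consistent with how the controller uses it (a single Message property is the only assumption here) would be:

// Hypothetical request model; the actual project may define it differently.
public record ChatPrompt(string Message);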

Code Walkthrough

The ChatController class is an API controller that handles AI-powered chat interactions using a configured chat client (IChatClient). It includes endpoints for generating responses and retrieving chat history while ensuring proper error handling and logging.

Chat Endpoint (POST /api/Chat)

public async Task<IActionResult> Chat(ChatPrompt chatPrompt)
  • Receives a user prompt (ChatPrompt) and processes it through the AI client.
  • Calls GroundPrompt to build a structured list of messages (system, assistant, user).
  • Returns the AI-generated response (response.Message), or logs the error and returns a 500 if the call fails (see the example request below).
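
For example, a request to this endpoint is just a JSON body carrying the user’s message (model binding is case-insensitive, so message maps to Message):

POST /api/Chat
Content-Type: application/json

{ "message": "How to dink" }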

Chat History Endpoint (POST /api/Chat/chathistory)

public async Task<IActionResult> ChatHistory(ChatPrompt chatPrompt)

Similar to the Chat endpoint, but it returns the full message list for the request rather than just the reply:

  • Appends the AI response to the grounded message list.
  • Returns the formatted messages, including each role (System, Assistant, User) and its content (see the sample response below).
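
With the same request body, the response from chathistory looks roughly like the following (the grounded system and assistant contents are abridged, the reply text is illustrative, and the exact serialization of Contents depends on the Microsoft.Extensions.AI version):

[
  { "role": "system", "message": [ ... ] },
  { "role": "assistant", "message": [ ... ] },
  { "role": "user", "message": [ { "text": "How to dink" } ] },
  { "role": "assistant", "message": [ { "text": "A dink is a soft shot that drops into the kitchen..." } ] }
]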

Part 7: Testing the API with Scalar

Now that the API is set up, we can use Scalar, a tool for testing APIs, to interact with the endpoints.

Scalar is a comparable tool to Swagger, with an interface that is slightly more developer-friendly, similar to Postman.
— Fuji Nguyen

You can send test messages through Scalar and see how the AI responds using the local Ollama instance.

How to test using Scalar

Step 1 — Run the project, select the endpoint /api/Chat/chathistory, and click Test Request. See the screenshot below for more details.

Screenshot of the chathistory endpoint

Step 2 — Select the URL https://localhost:44378, change the message to “How to dink,” and click Send. You should see a result like the one in the screenshot below.

Example of running endpoint in Scalar

Conclusion

In this blog, we’ve walked through how to set up a Web API using .NET 9 to interact with a local Ollama AI instance running the Llama 3.1 model, and we’ve used Scalar to exercise the endpoints interactively. Feel free to explore the complete source code on GitHub and customize it for your own applications!

References

  1. Familiarizing .NET Developers with Ollama: Leveraging Local AI Models for System Integration
    New to Ollama? This post introduces it to .NET developers.
  2. Self-hosted AI Starter Kit
    An open Docker Compose template that bootstraps a fully featured local AI and low-code development environment.
  3. Unified AI building blocks for .NET using Microsoft.Extensions.AI
    How the .NET ecosystem provides unified abstractions for integrating AI services into .NET applications and libraries.
