Using Ollama to run local LLMs on your computer
With Ollama it is possible to run Large Language Models locally on your PC. In this post I will show you how you can install and use the software.
What is a Large Language Model?
A large language model (LLM) is a language model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word. The famous ChatGPT by OpenAI is based on a large language model, which enables users to refine and steer a conversation.
Install Ollama and download LLMs
Let’s start by downloading Ollama from the official website ollama.com.
You will find two download buttons on the screen; pressing one of them redirects you to the official download page. There you can select your operating system, in my case Windows, and download the corresponding file.
If you use the Models link in the top right corner, you will get a list of all available Large Language Models that can be downloaded and used locally with Ollama.
After the installation you can open a Terminal and use the ollama command. By calling ollama pull <model name> you can download a Large Language Model. I want to try Phi-2, an LLM by Microsoft.
After the download of the model is complete, we can use ollama run <model name> to start a conversation with the corresponding model. You just need to enter your prompt and the model will answer accordingly.
By typing /bye you can exit the chat. If you add the --verbose parameter to the call, you will receive some additional statistics at the end of the response.
Ollama also acts as a server, so we are able to write code to simulate a chat conversation. I will show you two ways to access the Ollama server using Python. I assume that you already have Python installed on your machine. Let’s open Visual Studio Code and create a new folder ollama. In this folder add a file called requirements.txt, which contains all the needed packages. In our case we need langchain_community and requests.
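With those two packages, the requirements.txt file is simply:

```
langchain_community
requests
```

You can then install them from the folder with pip install -r requirements.txt.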
Now we create another file in the folder called main-langchaincommunity.py. This file uses the langchain_community package to connect to the Ollama server, send a simple prompt and print the response to the console.
If you open another Terminal window, you can switch to the created folder and run our Python script. You will see a programming joke on the console.
I will show you another approach using the requests package. Let’s create a new file called main-api.py in our folder. The Ollama server is running on localhost:11434 and provides the endpoint api/generate to generate a response. We just configure the headers and data objects and finally call our generate_response method in Python.
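A sketch of main-api.py might look like the following; the helper names (build_payload, generate_response) are my own choice here, and I again assume the phi model and the default port:

```python
# main-api.py
# Minimal sketch: call the Ollama REST API directly with the requests package.
import requests

URL = "http://localhost:11434/api/generate"
HEADERS = {"Content-Type": "application/json"}

def build_payload(prompt, model="phi"):
    # "stream": False makes Ollama return one complete JSON object
    # instead of a stream of partial chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate_response(prompt):
    # POST the prompt to the local Ollama server and return the answer text.
    data = build_payload(prompt)
    response = requests.post(URL, headers=HEADERS, json=data)
    response.raise_for_status()
    return response.json()["response"]

if __name__ == "__main__":
    print(generate_response("Tell me a joke about programming."))
```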
If we open the Terminal window again and call our main-api.py script, you will also get an answer from the locally running Large Language Model.
Conclusion
In this post I’ve explained how you can easily install Ollama on your Windows machine and use Large Language Models locally.
You will find the used code on my GitHub repository.