Building QA Bot with Ollama: Local LLM Platform

Amal
Published in ScrapeHero
5 min read · Nov 28, 2023

What is Ollama?

Running open-source large language models on a personal computer used to be a formidable task.

You had to deal with technical settings, environment management, and large storage requirements. Ollama streamlines that entire process.

Ollama handles the technical details of running large language models locally, so users don’t have to deal with the complexity.

It simplifies the setup, optimizes computer resources, and turns the once-challenging task of running models like Llama 2 into a straightforward experience.

Installation & Server Setup

Linux:

curl https://ollama.ai/install.sh | sh

Mac:

https://ollama.ai/download/Ollama-darwin.zip

At the time of writing, there is no download link for a Windows installer yet.

After installation, go to your Terminal and see if the command ‘ollama’ works.
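
If you would rather run this check from Python, a minimal sketch along the following lines should work. It assumes the `ollama` binary is on your PATH and prints its version when called with `--version`; the helper name `ollama_cli_status` is ours and not part of Ollama.

```python
import shutil
import subprocess


def ollama_cli_status(binary="ollama"):
    """Return the CLI's version text, or None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    # Assumption: the CLI prints its version when called with --version.
    proc = subprocess.run(
        [path, "--version"], capture_output=True, text=True, timeout=10
    )
    # Some tools print the version on stderr, so fall back to it.
    return (proc.stdout or proc.stderr).strip()
```

Calling `ollama_cli_status()` returns `None` when the command cannot be found, which usually means the installation did not complete or your shell needs to be restarted.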

Python Dependencies:

pip install langchain faiss-cpu sentence-transformers

Remove Ollama Service & Remove models:

# Remove Service

sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service

# Remove models
sudo rm -r /usr/share/ollama
sudo userdel ollama

Or follow this link.

About Dataset

The data was scraped from several news providers to collect each article’s title and content. ScrapeHero’s machine learning algorithm then processes it to derive fields such as the article’s sentiment and news category.

If you want to scrape news yourself, the ScrapeHero News API lets you search by news source, date, sentiment, category, or any other keyword.

Find the other listed APIs on ScrapeHero Cloud.

Sample Dataset:

Document Question Answering using Ollama and Langchain

We will build a RAG (Retrieval-Augmented Generation) pipeline with Ollama and the LangChain framework.

Load Data and Split the Data Into Chunks:

from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load the CSV; each row becomes one document
loader = CSVLoader(file_path="data.csv", encoding="utf-8")
data = loader.load()

# Split the documents into chunks of at most 500 characters, with no overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
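
To get a feel for what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified, pure-Python sketch. It cuts text at fixed character offsets, which is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraphs and newlines); the function name `split_fixed` is hypothetical.

```python
def split_fixed(text, chunk_size=500, chunk_overlap=0):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(split_fixed("0123456789", chunk_size=4, chunk_overlap=0))
print(split_fixed("0123456789", chunk_size=4, chunk_overlap=2))
```

With an overlap of 0, as used in this article, no text is repeated between chunks. A non-zero overlap repeats the tail of each chunk at the start of the next one, which can help when an answer straddles a chunk boundary.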

Vectorizing the chunks and saving the embedding:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Embed each chunk and build a FAISS index from the vectors
hugg_embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")
faiss_retriever = FAISS.from_documents(documents=all_splits, embedding=hugg_embeddings)
# Persist the index to a local folder
faiss_retriever.save_local("tesla_news_index")

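To load the saved index later and wrap it as a retriever that returns the 5 most similar chunks, use the line below. RetrievalQA expects a retriever rather than the raw vector store, so run this step before building the chain.

faiss_retriever = FAISS.load_local("tesla_news_index", hugg_embeddings).as_retriever(search_kwargs={"k": 5})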
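
Conceptually, a retriever with `k = 5` embeds the query and returns the five stored chunks whose vectors are closest to it. The sketch below illustrates that idea with cosine similarity on toy vectors. It is an illustration only: the vectors are made up, the function names are hypothetical, and FAISS uses its own optimized index and distance metric rather than this loop.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=5):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], doc_vecs, k=2))
```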

Let’s design a prompt template that tells the model to answer from the retrieved context in at most three sentences.

from langchain.prompts import PromptTemplate

# Prompt
template = """Use the following pieces of context to answer the question at the end.
If you don’t know the answer, just say that you don’t know, don’t try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
{context}
Question: {question}
Helpful Answer:"""

QA_CHAIN_PROMPT = PromptTemplate(
    input_variables=["context", "question"],
    template=template,
)
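
To see what the model actually receives, you can fill the two placeholders by hand. The sketch below uses plain `str.format` as a stand-in for the template class, and the chunks and question are hypothetical placeholders rather than rows from the dataset.

```python
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
{context}
Question: {question}
Helpful Answer:"""

# Hypothetical retrieved chunks; in the real chain these come from the retriever.
chunks = ["Example chunk one.", "Example chunk two."]

prompt = template.format(
    context="\n\n".join(chunks),
    question="What do the chunks say?",
)
print(prompt)
```

The prompt ends with "Helpful Answer:", so the model continues the text from that point with its answer.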

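Before you load the model, open a terminal and run the command below. It downloads Mistral on first use and then opens an interactive session, which you can leave with /bye.

ollama run mistral

Check that the model has been downloaded with: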

ollama list
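
The same check can be done from Python by asking the local Ollama server which models it has. This is a sketch under a few assumptions: the server is running at its default address `http://localhost:11434`, its `/api/tags` endpoint returns a JSON object with a `models` list, and the helper names are ours.

```python
import json
import urllib.error
import urllib.request


def parse_model_names(payload):
    """Extract model names from an /api/tags-style response body."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url="http://localhost:11434", timeout=5):
    """Return the names of locally available models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError):
        return None
```

If `list_local_models()` returns `None`, the server is probably not running. If it returns a list without a Mistral entry, the download has not finished.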

To use other models, check out Ollama’s Model Library.

Load the LLM model:

from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = Ollama(model="mistral", callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

Build the RAG chain, which retrieves similar documents from the FAISS vector store and passes them to the model:

from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=faiss_retriever,
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)
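
Under the hood, this kind of chain does three things: it retrieves chunks for the question, pastes them into the prompt as context, and sends the filled prompt to the model. The sketch below mimics that flow with stand-in functions so it runs without a model. The names `answer_with_context`, `fake_retrieve`, and `echo_llm` are hypothetical, and the blank-line separator between chunks is an assumption about the default behavior.

```python
def answer_with_context(question, retrieve, llm, template, separator="\n\n"):
    """Retrieve chunks, stuff them into the prompt, and ask the model."""
    docs = retrieve(question)             # 1. fetch relevant chunks
    context = separator.join(docs)        # 2. combine them into one context string
    prompt = template.format(context=context, question=question)
    return llm(prompt)                    # 3. let the model answer


# Stand-ins so the sketch runs without a vector store or a model.
def fake_retrieve(question):
    return ["chunk A", "chunk B"]


def echo_llm(prompt):
    return prompt.upper()


print(answer_with_context("q?", fake_retrieve, echo_llm, "{context}|{question}"))
```

Because every retrieved chunk goes into a single prompt, the number of chunks (`k`) and the chunk size together determine how much of the model's context window the prompt uses.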

Analysis On Tesla

query = "Has Tesla been impacted by the economic downturn?"
result = qa_chain({"query": query})
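
The streaming callback prints the answer as it is generated, but the chain call also returns a dictionary. As a small sketch, assuming the answer is stored under the chain's default `"result"` key, you could pull it out like this. The helper name and the stub dictionary are hypothetical.

```python
def extract_answer(result):
    """Pull the answer text out of a RetrievalQA-style result dict."""
    # Assumption: the answer is stored under "result", the chain's default output key.
    return result["result"].strip()


# Stub standing in for the dictionary returned by the chain call.
stub = {"query": "Example question?", "result": "  Example answer.  "}
print(extract_answer(stub))
```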

Response:

Tesla’s share of the US electric vehicle market slipped to a new low in Q3, indicating that the company is facing challenges in the market. Despite this, Tesla continues to be optimistic about its future and has set a goal to deliver 1.8 million vehicles in 2023. However, the company has also cut EV prices again following production and delivery downturns, which may have contributed to the decline in the share of the US electric vehicle market. Tesla’s shares fell after slashing Model 3 and Model Y prices in the US, indicating that the company is facing financial pressure due to the economic downturn

Let’s try something else:

query = "Tell me about the latest updates of Tesla CyberTruck"
result = qa_chain({"query": query})

Response:

Based on the given context, it can be inferred that the latest update for Tesla Cybertruck is related to the rollout of FSD Beta, which is a positive development. However, there have been issues with the fundamental design of the Cybertruck, as revealed by recent documents leaks, which is a negative sentiment. Additionally, there has been no information about the status of Cybertruck deliveries, but Tesla has announced that they will finally start on November 30, which is also a positive sentiment.

Last Question:

query = "What are the drawbacks of owning a Tesla?"
result = qa_chain({"query": query})

Response:

Based on the provided context, some of the drawbacks of owning a Tesla include lower range in the cheaper Model S and Model X, concerns about reliability due to lockouts and trapped owners, dissatisfaction with Tesla Vision among early Highland owners, and negative sentiment towards Tesla’s recent recall. Additionally, there are reports of Tesla’s cheapest ever EV being almost half the price of a Model 3, indicating potential cost concerns for some buyers. Despite these drawbacks, there are also positive aspects to owning a Tesla, such as the life-cycle ownership cost equivalent to the cheapest car in America and Tesla’s plans to double down on innovative features like yoke steering and RGB lighting in their new Model 3.

Conclusion:

Ollama is a robust tool for running open-source large language models locally. It is still in its early stages, but it shows great promise.

Future releases may bring support for additional models, a user-friendly interface, and other new features.

The code above is available in my GitHub repository.

I hope you learned something new today. Happy learning!
