Using Ollama Embedding services
Going local while doing the DeepLearning.ai “Build LLM Apps with LangChain.js” course.
DeepLearning.ai offers very good mini courses by the creators and developers of projects such as Llama, LangChain, …
In courses such as “Build LLM Apps with LangChain.js”, you watch a video while half the page shows a Jupyter notebook you can use to follow along. I like to download such notebooks and run them on my laptop, making minor modifications such as using LLMs running on LM Studio, GPT4ALL or Ollama.
The first issue was getting the OpenAI embedding API to work with my Ollama server listening at http://127.0.0.1:11434
While the OpenAI-compatible API for chat and completions needs the /v1 suffix, for embeddings you need to use /api.
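A quick way to sanity-check both paths is to call them directly with fetch. The snippet below is only a sketch: it assumes a default Ollama install on 127.0.0.1:11434 with a chat model and the nomic-embed-text embedding model already pulled, and the request shapes may vary between Ollama versions.

// Chat/completions go through the OpenAI-compatible endpoints under /v1
const chatRes = await fetch("http://127.0.0.1:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama2", // any chat model you have pulled
    messages: [{ role: "user", content: "Say hi" }]
  })
});
console.log(chatRes.status);

// Embeddings go through the native Ollama API under /api
const embRes = await fetch("http://127.0.0.1:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "nomic-embed-text", prompt: "Hello, world!" })
});
console.log((await embRes.json()).embedding.length);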
One way to discover the URI is to launch a fake HTTP server that reports any incoming request in the console.
from http.server import BaseHTTPRequestHandler, HTTPServer
import logging

class RequestHandler(BaseHTTPRequestHandler):
    def _send_404(self):
        self.send_response(404)
        self.end_headers()
        self.wfile.write(b"Not Found")

    def do_GET(self):
        logging.info(f"GET request,\nPath: {self.path}\nHeaders:\n{self.headers}")
        self._send_404()

    def do_POST(self):
        logging.info(f"POST request,\nPath: {self.path}\nHeaders:\n{self.headers}")
        self._send_404()

    def do_PUT(self):
        logging.info(f"PUT request,\nPath: {self.path}\nHeaders:\n{self.headers}")
        self._send_404()

    def do_DELETE(self):
        logging.info(f"DELETE request,\nPath: {self.path}\nHeaders:\n{self.headers}")
        self._send_404()

def run(server_class=HTTPServer, handler_class=RequestHandler, port=8000):
    logging.basicConfig(level=logging.INFO)
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    logging.info(f"Starting httpd on port {port}...\n")
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
    logging.info("Stopping httpd...\n")

if __name__ == '__main__':
    from sys import argv

    if len(argv) == 2:
        run(port=int(argv[1]))
    else:
        run()
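Running this script on, say, port 8000 and then pointing the embeddings client at it makes every request the client builds show up in the console, path and headers included. A sketch of such a probe (the clientConfig override is the same workaround described further down; expect a 404, the interesting part is the server log):

import { OpenAIEmbeddings } from "@langchain/openai";

// Point the client at the fake server so its log reveals the requested path.
const probe = new OpenAIEmbeddings({ apiKey: "nokey", modelName: "nomic-embed-text" });
probe.clientConfig.baseURL = "http://127.0.0.1:8000";
await probe.embedQuery("probe").catch(() => {}); // the 404 error is expected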
The original Jupyter notebook for the embeddings exercise starts by loading the environment variable definitions from the .env file into the process:
OPENAI_API_KEY="nokey"
OPENAI_BASE_URL="http://127.0.0.1:11434/api"
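The notebook loads them in its opening cells; in a standalone Node.js/TypeScript script, one way to do the same is the dotenv package, as in this sketch:

// Load .env into process.env; stand-in for whatever loader the notebook uses.
import "dotenv/config";

console.log(process.env.OPENAI_API_KEY);  // "nokey"
console.log(process.env.OPENAI_BASE_URL); // "http://127.0.0.1:11434/api"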
These settings didn’t affect the actual URL the embeddings call was trying to reach. Even after passing baseURL explicitly, as in the following code, debugging showed that baseURL was undefined inside OpenAIEmbeddings.
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  apiKey: "noneed",
  modelName: "nomic-embed-text",
  language: "en",
  baseURL: "http://127.0.0.1:11434/api"
});
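Depending on the @langchain/openai version, the OpenAI client options (including the base URL) can apparently also be passed as a separate configuration object rather than mixed into the embeddings fields. A hedged sketch, untested against the notebook’s pinned version:

import { OpenAIEmbeddings } from "@langchain/openai";

// baseURL passed as client configuration (second constructor argument)
const altEmbeddings = new OpenAIEmbeddings(
  { apiKey: "noneed", modelName: "nomic-embed-text" },
  { baseURL: "http://127.0.0.1:11434/api" }
);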
I was able to force the baseURL by adding the following code.
embeddings.clientConfig.baseURL = "http://127.0.0.1:11434/api";
embeddings.clientConfig
With the following output
{
  apiKey: "nokey",
  organization: undefined,
  baseURL: "http://127.0.0.1:11434/api",
  dangerouslyAllowBrowser: true,
  defaultHeaders: undefined,
  defaultQuery: undefined
}
Now I was able to run the embedding without any error, but I was getting back an empty vector.
One technique I use in troubleshooting is: if I can’t get through, I go around.
I just imported the Ollama embeddings class and off I went.
import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";

const ollama = new OllamaEmbeddings({ model: "nomic-embed-text" });
await ollama.embedQuery("Hello, world!").then(console.log);
This time I got the embeddings and continued running the notebook with minor changes.
[
0.041024480015039444, 0.5697265863418579, -3.203078269958496,
-0.7495182156562805, -0.6390430927276611, 0.5711612105369568,
-0.6257678270339966, -0.6846696138381958, -0.3815970718860626,
-1.457257628440857, 0.5961878895759583, 0.5777965188026428,
0.497438907623291, 1.3378394842147827, -0.32432079315185547,
-1.3941224813461304, -0.09010568261146545, -0.9264614582061768,...
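From there the rest of the notebook only needs the embeddings object swapped out. As an illustration (with placeholder texts, not the course’s actual documents), feeding OllamaEmbeddings into an in-memory vector store looks roughly like this:

import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Placeholder texts stand in for whatever documents the notebook loads.
const localEmbeddings = new OllamaEmbeddings({ model: "nomic-embed-text" });
const store = await MemoryVectorStore.fromTexts(
  ["Ollama serves local models", "LangChain.js can talk to them"],
  [{ id: 1 }, { id: 2 }],
  localEmbeddings
);
console.log(await store.similaritySearch("local models", 1));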