Learn Generative AI
Running an LLM on a Local Mac Machine
Using Generative AI without the Internet
Most people access generative AI tools like ChatGPT or Gemini through a web interface or API — but what if you could run them locally?
In this article, you’ll learn how to set up your own local generative AI using existing models such as DeepSeek and Meta’s LLaMA 3.
The final result will look like the GIF below (note that everything is hosted on localhost).
1. Download the Serving Engine
First of all, we need an LLM serving engine, such as Ollama or vLLM. Since I'm on a Mac without an NVIDIA GPU (so no CUDA), I'll go with Ollama, which we can install with Homebrew:
brew install ollama
Then we can start the Ollama service:
brew services start ollama
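Before moving on, it's worth confirming that the server is actually up. A minimal sanity check, assuming Ollama is listening on its default port 11434, is to hit the local API and list the installed models (the list will be empty until a model is pulled later):
# Check that the Ollama server is responding (default port 11434)
curl http://localhost:11434
# Expected output: "Ollama is running"
# List locally available models (empty until you pull one)
ollama list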
Note: this service needs to keep running so that Open WebUI (set up later) can load the LLM models. To stop the service, run the commands below.
# Optional: list running Homebrew services if you forget the service name
brew services
# Stop the Ollama service
brew services stop ollama
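If you'd rather not keep a background service around at all, one alternative (a sketch, not part of the original setup) is to run the server in the foreground only when you need it; pressing Ctrl+C or closing the terminal then stops it:
# Run the Ollama server in the foreground instead of as a brew service
ollama serve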