Ollama-Benchmark helps buyers decide which hardware spec to buy

Jason TC Chuang
aidatatools
Published in
2 min read · May 23, 2024


In certain businesses, privacy and security are big concerns when using LLMs. Companies don't want to share their secrets, so management might ban employees from using OpenAI ChatGPT or Google Gemini.

The internal IT team might consider building their own GenAI (local LLMs) specifically for their colleagues. There are around 30 ways to run local LLMs at the time of writing; please check here: List of Different Ways to Run LLMs Locally

With local LLMs, a company gains the freedom to fine-tune LLM models or apply RAG to solve problems in one specific domain. It can use internal company data to fine-tune a model that better fits the company's needs.

However, when a company decides to build its own LLMs, such as training its own models to solve one particular problem, practical questions arise: how many GPUs should they buy, and which hardware spec should they consider? There is already an article discussing How Many GPUs Do You Really Need for Model Training?

As for which GPU hardware spec to buy, here comes a reference. Thanks to the kindness of the general public in contributing data, Ollama-Benchmark is easy to install and run. It is open source, so people can see which data are collected.

pip install llm-benchmark
llm_benchmark run

The throughput results (tokens/sec) can be found on this site: https://llm.aidatatools.com/

It helps decision-makers weigh cost against performance to achieve better cost-performance value. Performance data from three major platforms, Linux, Windows, and macOS, is collected.
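As a minimal sketch of how a buyer might weigh cost against performance, the snippet below computes tokens/sec per dollar from benchmark throughput and hardware price. The GPU names, throughput numbers, and prices are placeholders for illustration, not measured results from llm.aidatatools.com.

```python
# Hypothetical cost-performance comparison.
# Throughputs (tokens/sec) and prices are made-up placeholder values;
# substitute real numbers collected from the benchmark site.
candidates = {
    "GPU A": {"tokens_per_sec": 55.0, "price_usd": 1600},
    "GPU B": {"tokens_per_sec": 90.0, "price_usd": 3500},
}

def cost_performance(spec):
    # Tokens per second per dollar spent: higher means better value.
    return spec["tokens_per_sec"] / spec["price_usd"]

best = max(candidates, key=lambda name: cost_performance(candidates[name]))
for name, spec in candidates.items():
    print(f"{name}: {cost_performance(spec):.4f} tokens/sec per USD")
print("Best value:", best)
```

With these placeholder numbers, the cheaper card wins on value even though the pricier one has higher raw throughput, which is exactly the trade-off the benchmark data helps quantify.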


Google Certified Professional Data Engineer. He holds a PhD from Purdue University. He loves solving real-world problems and building better tools with ML/AI.