Ollama-Benchmark helps buyers decide which hardware spec to buy

Jason TC Chuang
aidatatools
Published in
2 min read · May 23, 2024


In certain businesses, privacy and security are big concerns when using LLMs. Companies don't want to share their secrets, so management might ban employees from using OpenAI ChatGPT or Google Gemini.

The internal IT team might consider building their own GenAI (local LLMs) specifically for their colleagues. There are around 30 ways to run local LLMs at the time of writing; please check here: List of Different Ways to Run LLMs Locally

With local LLMs, a company gains the freedom to fine-tune LLM models or apply RAG to solve problems in one specific domain. It can use internal company data to fine-tune a model that better fits the company's needs.

However, when a company decides to build its own LLMs, such as training its own models to solve one particular problem, practical questions arise: how many GPUs should they buy, and which hardware spec should they consider? There is already an article discussing How Many GPUs Do You Really Need for Model Training?

As for which GPU hardware spec to buy, here comes a reference. Thanks to the kindness of the general public in contributing data, Ollama-Benchmark is easy to install and run. It is open source, so people can see which data are collected.

pip install llm-benchmark
llm_benchmark run

The throughput results (tokens/sec) can be found on this site: https://llm.aidatatools.com/

It helps decision-makers weigh cost against performance to achieve better cost-performance value. Performance data from three major platforms, Linux, Windows, and macOS, is collected.
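As a minimal sketch of how a buyer might weigh cost against performance, the snippet below computes tokens/sec per dollar from benchmark throughput and hardware price. The GPU names, throughput numbers, and prices are placeholders for illustration, not measured results from llm.aidatatools.com.

```python
# Hypothetical cost-performance comparison.
# Throughputs (tokens/sec) and prices are made-up placeholder values;
# substitute real numbers collected from the benchmark site.
candidates = {
    "GPU A": {"tokens_per_sec": 55.0, "price_usd": 1600},
    "GPU B": {"tokens_per_sec": 90.0, "price_usd": 3500},
}

def cost_performance(spec):
    # Tokens per second per dollar spent: higher means better value.
    return spec["tokens_per_sec"] / spec["price_usd"]

best = max(candidates, key=lambda name: cost_performance(candidates[name]))
for name, spec in candidates.items():
    print(f"{name}: {cost_performance(spec):.4f} tokens/sec per USD")
print("Best value:", best)
```

With these placeholder numbers, the cheaper card wins on value even though the pricier one has higher raw throughput, which is exactly the trade-off the benchmark data helps quantify.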


Google Certified Professional Data Engineer. He holds a PhD from Purdue University. He loves solving real-world problems and building better tools with ML/AI.