Articles by Andreas Kunar

GPU-Accelerated Containers for M1/M2/M3… Macs
Apple silicon Macs with macOS always had a major shortcoming for me — their GPUs were not usable in containers or virtual machines (VMs)…
Jun 20

Breaking News: Run Large LLMs Locally with Less RAM and Higher Speed through llama.cpp with QuIP#
A recent update to llama.cpp enables a new “crazy-sounding, but usable” 2-bit quantization for LLMs — QuIP: Quantization with Incoherence…
Jan 11

Benchmarking Apple’s MLX vs. llama.cpp
It might be a bit unfair to compare the performance of Apple’s new MLX framework (while using Python) to llama.cpp (written in C/C++ using…
Dec 23, 2023

Running Mixtral AI on your Mac
Mistral AI’s new Mixtral AI model to me is a breakthrough — with its GPT-3.5-like answer quality, excellent additional French, German…
Dec 15, 2023

llama.cpp Performance & Apple Silicon
llama.cpp enables running Large Language Models (LLMs) on your own machine. Their CPUs, GPUs, RAM size/speed, but also the used models are…
Dec 2, 2023

Thoughts on Apple Silicon Performance for Local LLMs
Apple silicon, with its integrated GPUs and unified, large, wide RAM looks very tempting for AI work. Especially when using Georgi…
Nov 25, 2023