TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Deploying LLMs locally with Apple’s MLX framework

A technical deep dive into the new deep learning library MLX

Heiko Hotz
Published in TDS Archive
9 min read · Jan 20, 2024

Image by author (using DALL-E 3)

What is this about?

In December 2023, Apple’s machine learning research team released MLX, an array framework for machine learning on Apple silicon. This tutorial explores the framework and demonstrates deploying the Mistral-7B model locally on a MacBook Pro (MBP). We’ll set up a local chat interface to interact with the deployed model and measure its inference performance in tokens generated per second. We’ll also dig into the MLX API to understand the available levers for altering the model’s behaviour and influencing the generated text.

As usual, the code is available in a public GitHub repository: https://github.com/marshmellow77/mlx-deep-dive
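To make the "tokens generated per second" metric concrete, here is a minimal sketch of how such a measurement could be taken. It is not taken from the repository above: `measure_tokens_per_second` and `fake_token_stream` are hypothetical names, and the fake stream merely stands in for whatever token-by-token generation loop the deployed model exposes.

```python
import time
from typing import Callable, Iterable, Iterator


def measure_tokens_per_second(
    token_stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> tuple[int, float]:
    """Consume a token stream and return (token_count, tokens_per_second)."""
    start = clock()
    count = 0
    for _ in token_stream:
        count += 1
    elapsed = clock() - start
    # Guard against a zero interval (empty stream or a very coarse clock).
    rate = count / elapsed if elapsed > 0 else 0.0
    return count, rate


def fake_token_stream(n_tokens: int, delay_s: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a model's streaming generation loop."""
    for i in range(n_tokens):
        if delay_s:
            time.sleep(delay_s)  # simulate per-token latency
        yield f"tok{i}"


if __name__ == "__main__":
    count, rate = measure_tokens_per_second(fake_token_stream(50, delay_s=0.01))
    print(f"{count} tokens at {rate:.1f} tokens/s")
```

Because the timer starts before the first token is pulled from the stream, the figure includes any prompt-processing time. If only steady-state generation speed is of interest, one option is to start the clock after the first token arrives.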

Why is this important?

Apple’s new machine learning framework, MLX, offers notable advantages over other deep learning frameworks thanks to its unified memory architecture on Apple silicon. Unlike traditional frameworks such as PyTorch and JAX, which require costly data copying between CPU and GPU, MLX keeps data in shared memory accessible to both. This design eliminates the overhead of data…

Written by Heiko Hotz

Generative AI Blackbelt @ Google — All opinions are my own
