Intel Core Ultra 5 vs. Apple M1 on LLM inference

Wei Lu
3 min read · May 24, 2024

I bought an Intel “AI PC” equipped with a Core Ultra 5 125H and 96GB of memory; since the integrated GPU shares system memory, up to 48GB of it can serve as VRAM. This mini PC costs just over half the price of my current daily driver, a MacBook Air M1 with 16GB of RAM. So I ported my Twitter/X translation project to the new platform.

Intel provides various neural network computing libraries for CPU, discrete GPU, and integrated GPU (iGPU). To minimize code modifications, I tried IPEX-LLM first.

Install Python dependencies:

pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

Set environment variables (the set syntax below is for the Windows command prompt; use export on Linux):

set SYCL_CACHE_PERSISTENT=1
set BIGDL_LLM_XMX_DISABLED=1
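Alternatively, to keep the configuration inside the script itself (and make it work the same on Linux), the two variables can be set from Python before ipex_llm is imported. The variable names come from the commands above; the rest is just standard os.environ usage:

```python
import os

# These must be set before ipex_llm / the XPU backend is imported,
# so place this at the very top of the script.
os.environ["SYCL_CACHE_PERSISTENT"] = "1"   # persist compiled SYCL kernels between runs
os.environ["BIGDL_LLM_XMX_DISABLED"] = "1"  # disable XMX, as recommended for some iGPUs

print(os.environ["SYCL_CACHE_PERSISTENT"])  # -> 1
```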

Only two code changes were needed:

import torch
# CHANGE 1: import AutoModelForSeq2SeqLM from ipex_llm.transformers instead of transformers
from transformers import AutoTokenizer, pipeline, GenerationConfig  # , AutoModelForSeq2SeqLM
from ipex_llm.transformers import AutoModelForSeq2SeqLM
import re
from datafile import print_err

class Translator:
    models_dict = {
        'nllb-1.3B': 'facebook/nllb-200-1.3B',
        'nllb-3.3B': 'facebook/nllb-200-3.3B',
        'nllb-distilled-600M': 'facebook/nllb-200-distilled-600M',
        'nllb-distilled-1.3B'…
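The models_dict above maps short names to Hugging Face repo ids. A small hypothetical helper (not part of the original code, shown only to illustrate how the dict is used) that resolves a short name and fails loudly on typos could look like:

```python
# The dict entries are copied from the article; the function is illustrative.
MODELS_DICT = {
    'nllb-1.3B': 'facebook/nllb-200-1.3B',
    'nllb-3.3B': 'facebook/nllb-200-3.3B',
    'nllb-distilled-600M': 'facebook/nllb-200-distilled-600M',
}

def resolve_model(name: str) -> str:
    """Map a short model name to its Hugging Face repo id."""
    try:
        return MODELS_DICT[name]
    except KeyError:
        known = ', '.join(sorted(MODELS_DICT))
        raise ValueError(f"unknown model '{name}', expected one of: {known}")

print(resolve_model('nllb-1.3B'))  # -> facebook/nllb-200-1.3B
```

The resolved id would then be passed to the swapped-in loader, e.g. AutoModelForSeq2SeqLM.from_pretrained(repo_id, load_in_4bit=True) as documented by IPEX-LLM, with the model moved to the 'xpu' device afterwards.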


I am a software engineer and family man living in Finland who enjoys photography and tinkering with code.