I bought an Intel “AI PC” equipped with a Core Ultra 5 125H and 96GB of memory. Since the integrated GPU shares system memory and can use up to half of it, this amounts to a graphics card with 48GB of VRAM. The mini PC cost just a little more than half the price of my current daily machine, a MacBook Air M1 with 16GB. So I ported my Twitter/X translation project to the new platform.
Intel provides several neural-network computing libraries covering the CPU, discrete GPUs, and the integrated GPU (iGPU). To minimize code changes, I tried IPEX-LLM first.
Install Python dependencies:
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
Set environment variables:
set SYCL_CACHE_PERSISTENT=1
set BIGDL_LLM_XMX_DISABLED=1
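The `set` commands above are for a Windows command prompt. On Linux (or in a WSL shell) the same two variables would be set with `export` instead, roughly like this:

```shell
# Linux/WSL equivalent of the Windows `set` commands above:
# persist the SYCL kernel cache across runs, and disable the XMX path
# (the variable names come from the IPEX-LLM setup instructions)
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

Either way, the variables must be set in the same shell session that later launches the Python script.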
Only two code changes were needed:
import torch
# CHANGE 1: import AutoModelForSeq2SeqLM from ipex_llm.transformers instead of transformers
from transformers import AutoTokenizer, pipeline, GenerationConfig  # , AutoModelForSeq2SeqLM
from ipex_llm.transformers import AutoModelForSeq2SeqLM
import re
from datafile import print_err
class Translator:
    models_dict = {
        'nllb-1.3B': 'facebook/nllb-200-1.3B',
        'nllb-3.3B': 'facebook/nllb-200-3.3B',
        'nllb-distilled-600M': 'facebook/nllb-200-distilled-600M',
        'nllb-distilled-1.3B'…