Pinned
Run Mixtral-8x7B on Consumer Hardware with Expert Offloading
Finding the right trade-off between memory usage and inference speed
Benjamin Marie in Towards Data Science · Jan 11

Run and Serve Faster VLMs Like Pixtral and Phi-3.5 Vision with vLLM
Understanding how much memory you need to serve a VLM
Benjamin Marie in Towards Data Science · 4h ago

FLUTE: Faster QLoRA Fine-tuning with NF4 Models
Finally, NF4 models have a reasonable latency
Benjamin Marie in Stackademic · 5d ago

AdEMAMix: Achieve the Same Results as with AdamW Using Only Half as Many Training Tokens
With two momentum terms
Benjamin Marie · 6d ago

GGUF Quantization with Imatrix and K-Quantization to Run LLMs on Your CPU
Fast and accurate GGUF models for your CPU
Benjamin Marie in Towards Data Science · Sep 13

Better Prioritize LLM Tasks for Higher System Throughput
How to replace the naive “first-come-first-serve” rule
Benjamin Marie · Sep 5

Enhanced SSM Training Through Initialization with a Pre-trained Transformer
The Mamba in the Llama
Benjamin Marie in Stackademic · Sep 4

Zamba2-1.2B: A Smaller Hybrid SSM/Transformer
Very fast and memory-efficient inference
Benjamin Marie · Sep 3