Pinned: Benjamin Marie in Towards Data Science, "Run Mixtral-8x7B on Consumer Hardware with Expert Offloading": Finding the right trade-off between memory usage and inference speed (Jan 11)
Benjamin Marie, "Better Prioritize LLM Tasks for Higher System Throughput": How to replace the naive "first-come, first-served" rule (3 days ago)
Benjamin Marie in Stackademic, "Enhanced SSM Training Through Initialization with a Pre-trained Transformer": The Mamba in the Llama (4 days ago)
Benjamin Marie, "Zamba2-1.2B: A Smaller Hybrid SSM/Transformer": Very fast and memory-efficient inference (5 days ago)
Benjamin Marie in Stackademic, "Jamba 1.5: Two New Hybrid Transformer/SSM Models of 52B and 398B Parameters": Huge but very efficient, especially for long-context processing (Aug 29)
Benjamin Marie in Towards Data Science, "Mistral-NeMo: 4.1x Smaller with Quantized Minitron": How pruning, knowledge distillation, and 4-bit quantization can make advanced AI models more accessible and cost-effective (Aug 29)
Benjamin Marie in Stackademic, "Falcon Mamba 7B: SSM (Attention-Free) Models Are Getting Better": Attention-free models for faster inference (Aug 20)