Pinned - Benjamin Marie in Towards Data Science - Mistral 7B: Recipes for Fine-tuning and Quantization on Your Computer - Cheap supervised fine-tuning with an impressive LLM (Oct 26, 2023)
Pinned - Benjamin Marie in Towards Data Science - Run Mixtral-8x7B on Consumer Hardware with Expert Offloading - Finding the right trade-off between memory usage and inference speed (Jan 11)
Benjamin Marie - Piccolo2: Multitask Hybrid Training for Text Embeddings - Exploiting datasets from different types of tasks for training better text embeddings (1d ago)
Benjamin Marie - Sparse Llama: 70% Smaller, 3x Faster, and Full Accuracy - Pruning and short pre-training (May 17)
Benjamin Marie - RWKV-6: Attention-free and State-of-the-art 7B LLM - Especially good for multilingual tasks (May 15)
Benjamin Marie - Fine-tune Tiny Chat Models with Apple OpenELM and ORPO - Can we make a good chat model with a 270M LLM? (May 12)
Benjamin Marie in Towards Data Science - Turn Llama 3 into an Embedding Model with LLM2Vec - RAG with Llama 3 for the generation and the retrieval (May 3)
Benjamin Marie in Towards Data Science - Jamba: The New Hybrid Transformer/Mamba - Faster and better than the transformer but more difficult to train (Apr 30)
Benjamin Marie - Estimate the Memory Consumption of LLMs for Inference and Fine-tuning - A close look at the memory consumption of Command-R+, Mixtral-8x22B, and Llama 3 70B (Apr 27)