Zain ul Abideen · MoE vs Dense vs Hybrid LLM Architectures · Train 600M MoE, Dense, Hybrid LLM Architectures. · 5 min read · Apr 29, 2024
Zain ul Abideen · Schedule-Free Learning — A New Way to Train Models · Training 3 Llama models for comparison of Cosine Scheduled and Schedule-Free optimizer. · 5 min read · Apr 18, 2024
Zain ul Abideen · Llama-Bitnet | Training a 1.58 bit LLM · What is 1 bit LLM and How to train 70M Llama-Bitnet? · 5 min read · Apr 4, 2024
Zain ul Abideen · ORPO Outperforms SFT+DPO | Train Phi-2 with ORPO · Train Phi-2 with ORPO with LazyOrpo · 5 min read · Mar 22, 2024
Zain ul Abideen · Multi-GPU Training of 70B LLM with Deepspeed and FSDP+Qlora · Train 70–120B LLM on 4xA100s and 2xRTX3090s (Consumer-grade GPUs) · 5 min read · Mar 14, 2024
Zain ul Abideen · Weekly AI News | The Latest AI Updates| 3 Mar— 10 Mar · A quick dive into recent Generative-AI research, analyzing AI in business, and learn about this week’s recent AI tools. · 4 min read · Mar 11, 2024
Zain ul Abideen · How to Train a 7B Coding Chat Model? · Fine-tuning Bigcode’s new Starcoder2-7B on 100k Glaive dataset. · 8 min read · Mar 10, 2024
Zain ul Abideen · Everything you need to know about Google’s new Gemma 7B and 2B Models · Also releasing Gemma-7B-Openhermes and Gemma-2B-Openhermes · 5 min read · Feb 29, 2024
Zain ul Abideen · Weekly AI News | The Latest AI Updates| 19 Feb — 25 Feb · A quick dive into recent Generative-AI research, analyzing AI in business, and learn about this week’s recent AI tools. · 5 min read · Feb 25, 2024
Zain ul Abideen · Weekly AI News | The Latest AI Updates| 11 Feb — 18 Feb · A quick dive into recent Generative-AI research, analyzing AI in business, and learn about this week’s recent AI tools. · 5 min read · Feb 18, 2024