Mikhail Khrushchev in Yandex
YaFSDP — a tool for faster LLM training and optimized GPU utilization
Last week, we open-sourced the YaFSDP method, a new tool designed to dramatically speed up the training of large language models.
Jun 17

Mikhail Khrushchev in Yandex
Yandex Publishes YaLM 100B. It’s the Largest GPT-Like Neural Network in Open Source
In recent years, large-scale transformer-based language models have become the pinnacle of neural networks used in NLP tasks. They grow in…
Jun 23, 2022