- "Unlocking the Power of Distributed Training: Scaling Deep Learning for Massive Datasets" by Haseeb Ullah Khan Shinwari. 1d ago
- "Distributed Parallel Training: Data Parallelism and Model Parallelism" by Luhui Hu in Towards Data Science. How to scale out training large models like GPT-3 & DALL-E 2 in PyTorch. Sep 18, 2022
- "Understanding Model Sharding and Model Parallelism: Scaling Large Language Models" by Pranay Janupalli. In the realm of large-scale models, particularly large language models (LLMs), managing memory and computational resources efficiently is a… Jul 7
- "Problem with Training Large Neural Networks and Solutions Devised over the Years" by Pranjal Khadka. Neural networks are being used extensively today around the world to solve complex problems in different domains. In recent years… Jun 6
- "9 Libraries for Parallel & Distributed Training/Inference of Deep Learning Models" by ML Blogger. In this blog we will cover a few basics of large model training before jumping to the list of libraries available. To skip the basics of… Oct 3, 2022
- "Paper Breakdown: Easy Scaling with Micro-Batch Pipeline Parallelism" by Pranjal Khadka. A breakdown of the paper "GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism", published in 2019 by Google, which introduced… Jun 4