Meta Is Building a 16,000-GPU AI Supercomputer — The Fastest Ever
Will it help the company improve its damaged public image?
A few days ago Meta introduced its AI Research SuperCluster (RSC), a new high-performance computing (HPC) cluster intended to power AI technologies ranging from the latest language models to the metaverse. When it's finished in mid-2022, Meta says, RSC will be the fastest AI supercomputer in the world, making the company a strong contender to unveil the next round of AI breakthroughs. The company plans to use RSC to train trillion-parameter NLP models, to further develop its computer vision and speech-recognition products, and to advance in critical areas like embodied AI and multimodal AI.
RSC notably improves Meta's hardware capabilities. As of now, it comprises 760 Nvidia DGX A100 systems (6,080 GPUs) and will scale to 16,000 A100 GPUs once finished. For comparison, Nvidia Selene is a DGX A100 SuperPOD that ranks 6th in the world with 4,480 GPUs (63.4 Pflop/s). The Microsoft Azure supercomputer, built in partnership with OpenAI, has 10,000 V100 GPUs and ranks 10th in the world (30.05 Pflop/s). And Tesla's current supercluster, to be replaced by Dojo in the coming years, consists of 720 nodes of 8 A100 GPUs each (5,760 GPUs) and would rank 5th in the world (although it uses lower-precision floating-point formats than other HPCs, making it difficult to…