Five Key Facts About Wu Dao 2.0: The Largest Transformer Model Ever Built

The record-setting model combines some clever research and engineering methods.

Jesus Rodriguez
DataSeries


Image Source: https://www.forbes.com/sites/alexzhavoronkov/2021/07/19/wu-dao-20bigger-stronger-faster-ai-from-china/?sh=4a5264ed6fb2

I recently started an AI-focused educational newsletter that already has over 100,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:

It seems that every other month we hit a new milestone in the race to build massively large transformer models. The trajectory is astonishing. GPT-2 set a new record with a 1.5-billion-parameter model, only to be surpassed by Microsoft’s Turing-NLG with 17 billion parameters. GPT-3 set the mark at 175 billion parameters, and Google’s Switch Transformer took it to 1.6 trillion. Recently, the Beijing Academy of Artificial Intelligence (BAAI) announced the release of Wu Dao 2.0, a transformer model that contains a mind-blowing 1.75 trillion parameters…
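To put those milestones in perspective, here is a back-of-the-envelope sketch comparing the parameter counts mentioned above. The counts come from the paragraph itself; the memory figures are my own rough assumption of 2 bytes per parameter (fp16 weights only), ignoring optimizer state, activations, and any sparsity or mixture-of-experts routing the larger models actually use.

```python
# Rough, illustrative scale comparison of the transformer milestones above.
# Assumption: 2 bytes per parameter (fp16 weights only); real training
# footprints are much larger and sparse models don't activate all weights.

MODELS = {
    "GPT-2": 1.5e9,
    "Turing-NLG": 17e9,
    "GPT-3": 175e9,
    "Switch Transformer": 1.6e12,
    "Wu Dao 2.0": 1.75e12,
}

BYTES_PER_PARAM = 2  # fp16

for name, params in MODELS.items():
    gigabytes = params * BYTES_PER_PARAM / 1e9
    print(f"{name:>18}: {params / 1e9:>8.1f}B params ≈ {gigabytes:>6.0f} GB of fp16 weights")
```

Even under this generous simplification, Wu Dao 2.0's weights alone would occupy terabytes of memory, which is why models at this scale lean on the distributed training and sparsity techniques discussed later in the article.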
