Google and NVIDIA Break MLPerf Records

Synced · Published in SyncedReview · 3 min read · Jul 11, 2019

“Faster, Higher, Stronger” is the Olympic motto that has pushed athletes to excellence for over a century. It’s not that different in the arena of AI model training, where the world’s top tech companies are locked in an ongoing race to advance their research performance.

Forerunners Google and NVIDIA announced today that they have set new AI training time records in the MLPerf benchmark competition, an industry-wide standard for assessing ML performance that measures how long it takes to train one of six ML models to a quality target in tasks spanning image classification, object detection, translation, and playing Go. The results underline the importance of speed in model training; as NVIDIA put it in their blog post: "You can't be first if you're not fast."
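For readers unfamiliar with the metric, MLPerf's headline number is simply wall-clock "time to train": the clock starts when training begins and stops the first time the model reaches a fixed quality target. The Python sketch below illustrates the idea; train_one_epoch and evaluate are hypothetical stand-ins for a real pipeline, and the 75.9 percent top-1 accuracy target is the figure MLPerf used for ResNet-50 in this round.

```python
import time

# Illustrative sketch of MLPerf's "time to train" metric: wall-clock time
# from the start of training until the model first reaches a fixed quality
# target (75.9% top-1 validation accuracy for ResNet-50 in MLPerf v0.6).
TARGET_ACCURACY = 0.759

def train_one_epoch(model, train_data):
    """Run one pass over the training set (hypothetical stand-in)."""
    pass

def evaluate(model, val_data):
    """Return validation top-1 accuracy in [0, 1] (hypothetical stand-in)."""
    return 0.0

def time_to_train(model, train_data, val_data, max_epochs=90):
    """Wall-clock seconds until the quality target is first met."""
    start = time.perf_counter()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(model, train_data)
        if evaluate(model, val_data) >= TARGET_ACCURACY:
            return time.perf_counter() - start
    raise RuntimeError("quality target not reached within max_epochs")
```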

Over 40 tech companies (Alibaba, AMD, Baidu, Facebook, Google, Intel, NVIDIA, etc.) and AI researchers from top universities took part in building the MLPerf benchmark.

Google Cloud TPU v3 Pods broke records for training Transformer, Single Shot Detector (SSD), and ResNet-50. In a Google Cloud blog post, the company boasted that it is "the first public cloud provider to outperform on-premise systems when running large-scale, industry-standard ML training workloads" of the above models.

In the lightweight object detection and non-recurrent translation tasks, Google Cloud TPU v3 Pods managed to train the SSD and Transformer models over 84 percent faster than the fastest on-premise systems. SSD has become a popular choice for object detection amid growing demand for applications in medical imaging, autonomous driving, and photo editing. Synced previously reported on how the Transformer architecture, now at the center of NLP, has inspired many significant advances in machine translation, language modeling, and high-quality text generation.
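For a rough sense of what kicking off such a workload on Cloud TPUs looks like, here is a minimal sketch using TensorFlow's TPUStrategy with a stock Keras ResNet-50. The TPU name "my-tpu" is a placeholder, and this generic setup is nothing like Google's heavily tuned MLPerf submission; note that on older TensorFlow 2.x releases the strategy lives under tf.distribute.experimental.TPUStrategy instead.

```python
import tensorflow as tf

# Resolve and initialize the Cloud TPU; "my-tpu" is a placeholder name.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="my-tpu")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Model variables must be created inside the strategy's scope so they
# are replicated across all TPU cores.
with strategy.scope():
    model = tf.keras.applications.ResNet50(weights=None, classes=1000)
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# model.fit(dataset) would then shard each batch across the pod's cores.
```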

While Google Cloud showed faster-than-ever training speeds with its Cloud TPUs, NVIDIA's DGX SuperPOD, packed with Tesla V100 Tensor Core GPUs, also performed exceptionally well, cutting ResNet-50 image classification training time to 80 seconds. To get an idea of how quickly these performance marks are being eclipsed: just two years ago it took an NVIDIA DGX-1 system with V100 GPUs eight hours to complete the same training.
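Records like these come from data parallelism at extreme scale: every GPU trains on its own shard of the data, and gradients are averaged across the whole system after each step. The sketch below shows the pattern with PyTorch's DistributedDataParallel and a dummy batch in place of a real ImageNet loader; it is a generic illustration of the technique, not NVIDIA's actual optimized submission code.

```python
import os
import torch
import torch.distributed as dist
import torchvision
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torchvision.models.resnet50().cuda(local_rank)
    # DDP all-reduces gradients across every GPU after each backward pass,
    # so all replicas stay in sync while each trains on its own data shard.
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

    # Dummy batch standing in for an ImageNet data loader.
    images = torch.randn(32, 3, 224, 224, device=local_rank)
    labels = torch.randint(0, 1000, (32,), device=local_rank)

    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(images), labels)
    loss.backward()  # gradients are synchronized here
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, e.g., torchrun --nproc_per_node=8 train.py, this starts one process per GPU on a node.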

The complete results of the MLPerf competition are available on the project website.

Journalist: Fangyu Cai | Editor: Michael Sarazen

