New DeepMind Unsupervised Image Model Challenges AlexNet

Jun 11 · 3 min read

While supervised learning has tremendously improved AI performance in image classification, a major drawback is its reliance on large-scale labeled datasets. This has prompted researchers to explore unsupervised and semi-supervised learning, techniques that reduce or forgo data annotation but have their own drawback: diminished accuracy.

A new paper from Google’s UK-based research company DeepMind addresses this with a model based on Contrastive Predictive Coding (CPC) that outperforms the fully-supervised AlexNet model in Top-1 and Top-5 accuracy on ImageNet.

CPC was introduced by DeepMind in 2018. The unsupervised learning approach uses a powerful autoregressive model to extract representations of high-dimensional data by predicting future samples. Researchers trained a model, a ResNet in this paper, to make predictions from unlabeled data and then used a contrastive loss function to evaluate the quality of these predictions and build a high-quality unsupervised pre-trained feature representation. CPC was originally applied to speech tasks, and DeepMind researchers have also demonstrated its efficacy on images, text, and reinforcement learning.
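To make the idea of a contrastive loss concrete, here is a minimal sketch of an InfoNCE-style loss for a single prediction. It is an illustration only, not DeepMind's implementation: the function names, the plain dot-product scoring, and the toy vectors are assumptions made for this example.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce_loss(prediction, positive, negatives):
    """InfoNCE-style contrastive loss for one prediction (illustrative sketch).

    prediction: vector predicted from the context
    positive:   representation of the true target
    negatives:  list of distractor representations
    The loss is the negative log-probability of picking the true target
    among all candidates under a softmax over similarity scores.
    """
    scores = [dot(prediction, positive)] + [dot(prediction, n) for n in negatives]
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[0]

# Toy vectors (assumed): the loss is lower when the prediction matches the target.
target = [1.0, 0.0]
distractors = [[0.0, 1.0], [-1.0, 0.0]]
good_loss = info_nce_loss([1.0, 0.0], target, distractors)
bad_loss = info_nce_loss([-1.0, 0.0], target, distractors)
```

A prediction that scores the true target higher than the distractors gets a small loss, so minimizing the loss pushes the network toward representations that make such predictions possible.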

A major contribution of this paper is an improved CPC architecture that captures more useful representations from unlabeled data. Specifically, researchers enlarged an original 23-block ResNet 101 model to a 46-block ResNet 170 model and applied layer normalization to improve training efficiency. They designed a challenging task to pretrain the model and added data augmentation to increase the difficulty of the task.

Researchers designed two methods to train a CPC model with an attached linear classifier in a semi-supervised manner: train the CPC feature extractor on an unlabeled dataset, freeze its parameters, and then optimize the attached classifier using a small amount of labeled data; or pretrain the extractor on the unlabeled dataset and then fine-tune the parameters of the entire network, extractor and classifier together, on the labeled data.
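The first of the two methods above, a classifier trained on top of a frozen feature extractor, can be sketched in a few lines. This is a toy illustration under stated assumptions: `frozen_extractor` is a hypothetical stand-in for a pretrained CPC encoder, the task is binary rather than 1,000-class, and the data points are invented for the example.

```python
import math

def frozen_extractor(x):
    """Hypothetical stand-in for a pretrained encoder; it is never updated."""
    return [x[0] + x[1], x[0] - x[1]]

def train_linear_probe(inputs, labels, epochs=200, lr=0.1):
    """Fit a binary logistic-regression classifier on frozen features.

    Only the classifier weights w and bias b are trained; the extractor
    is treated as a fixed function.
    """
    feats = [frozen_extractor(x) for x in inputs]
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of class 1
            g = p - y                        # gradient of the log loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Classify x using the frozen features and the trained linear layer."""
    f = frozen_extractor(x)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z > 0 else 0

# A handful of labeled examples (assumed toy data).
train_x = [(2, 1), (3, 1), (1, 2), (1, 3)]
train_y = [1, 1, 0, 0]
w, b = train_linear_probe(train_x, train_y)
```

The second method would differ in one respect: the extractor's own parameters would also receive gradient updates during the labeled training phase, instead of staying fixed.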

Experimental results showed that a linear classifier trained on CPC-extracted features of images from the ILSVRC ImageNet competition dataset obtained 61.0 percent Top-1 and 83.0 percent Top-5 accuracy, outperforming AlexNet's scores of 59.3 percent and 81.8 percent respectively.
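For readers unfamiliar with the metrics, Top-1 accuracy counts a prediction as correct only if the highest-scoring class is the true label, while Top-5 counts it as correct if the true label is among the five highest-scoring classes. A minimal sketch of the computation follows; the function name and the tiny three-class score table are assumptions for illustration, not data from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest scores.

    scores: list of per-class score lists, one per example
    labels: list of true class indices
    """
    correct = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            correct += 1
    return correct / len(labels)

# Toy scores for three examples over three classes (assumed values).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Because the Top-k set always contains the Top-1 prediction, Top-5 accuracy can never be lower than Top-1 accuracy on the same predictions.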

Given 13 labeled images per class, DeepMind’s CPC model outperformed state-of-the-art semi-supervised methods by 10 percent in Top-5 accuracy, and supervised methods by 20 percent.

Researchers also suggested that the CPC model's unsupervised representation can transfer well to other downstream tasks. Experimental results showed that the best-performing CPC model attached to a Faster R-CNN object detection network was only 2.6 percent short of the accuracy achieved by a fully-supervised ResNet.

While still at an early stage, DeepMind’s continuing research and development on unsupervised learning might one day enable the use of massive amounts of unlabeled data to build a machine-driven intelligent world of the future.

Read the paper: Data-Efficient Image Recognition with Contrastive Predictive Coding.

Journalist: Tony Peng | Editor: Michael Sarazen



We produce professional, authoritative, and thought-provoking content relating to artificial intelligence, machine intelligence, emerging technologies and industrial insights.


AI Technology & Industry Review | Twitter: @Synced_Global

