Self-supervised learning, semi-supervised learning, pretraining, self-training, robust representations: these are some of the hottest terms right now in Computer Vision and Deep Learning. The recent progress in self-supervised learning is astounding. Towards this end, researchers at FAIR have now come up with a new paper that introduces a method for learning robust image representations.


One of the most important goals of self-supervised learning is to learn robust representations without using labels. Recent works try to achieve this goal by combining two elements: contrastive loss and image transformations.

In late 2018, researchers at FAIR published the paper "Rethinking ImageNet Pre-training", which was subsequently presented at ICCV 2019. The paper presented some very interesting results regarding pre-training. I didn’t write a post about it then, but we had a long discussion on it in our KaggleNoobs Slack. Researchers at Google Research, Brain Team have now come up with an extended version of the same concept. This new paper not only talks about pre-training but also investigates self-training and how it compares to pre-training and self-supervised learning on the same set of tasks.


Before we dive into the details presented in…

Have you heard of meta-learning? Do you remember the time when you used pseudo labeling for a Kaggle competition? What if we combine the two techniques? Continuing the series of posts regarding semi-supervised learning, today, we will discuss the latest research paper that aims to combine meta-learning and pseudo labeling for semi-supervised learning. We won’t be discussing the basics of meta-learning here. If you don’t know about it, I would suggest reading this excellent article by Lilian Weng.


Before we dive into the paper, let’s take a step back and try to understand why we need this kind of technique…

Self-supervised learning is finally getting all the attention it deserves. From vision-based tasks to language modeling, it has paved a new way of learning (much) better representations. This paper, SimCLR, presents a new framework for contrastive learning of visual representations.

Contrastive Learning

Before getting into the details of SimCLR, let’s take a step back and try to understand what “contrastive learning” is. Contrastive learning is a learning paradigm where we want to learn distinctiveness. We want to learn what makes two objects similar or different. …
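To make "learning distinctiveness" concrete, here is a toy InfoNCE-style contrastive loss in NumPy. This is a sketch of the general idea, not SimCLR's exact NT-Xent formulation; the vectors and the temperature value are purely illustrative:

```python
import numpy as np

def contrastive_loss(z_i, z_j, negatives, temperature=0.5):
    """InfoNCE-style loss for one positive pair (z_i, z_j) against negatives.
    Low loss when the positive pair is similar and negatives are dissimilar."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    pos = np.exp(cos(z_i, z_j) / temperature)
    neg = sum(np.exp(cos(z_i, n) / temperature) for n in negatives)
    return -np.log(pos / (pos + neg))

# Two views of the same image should give a lower loss than unrelated vectors.
anchor = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])        # augmented view, nearly identical
negatives = [np.array([0.0, 1.0])]     # a different image
loss_similar = contrastive_loss(anchor, positive, negatives)
loss_dissimilar = contrastive_loss(anchor, negatives[0], [positive])
print(loss_similar < loss_dissimilar)  # → True
```

Pulling the positive pair together while pushing the negatives apart is exactly the "what makes two objects similar or different" intuition above.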

2019 is coming to an end. The landscape of Data Science and Machine Learning has made great progress. From an overwhelming number of papers to a growing focus on reproducibility and interpretability, this has been an incredible year overall. But today, I am not going to talk about another research paper or anything related to Data Science and Machine Learning in general. There is a very important aspect that I want to talk about: Community Building.

Note: This isn’t an article about how to get started with Kaggle or how to learn Machine learning by doing, etc. …

Object Detection has come a long way. From trivial computer vision techniques for object detection to advanced object detectors, the improvements have been amazing. Convolutional Neural Networks (CNNs) have played a huge role in this revolution. We want our detector to be as accurate as possible as well as fast enough to run in real-time. These two aspects involve a trade-off, and most detectors have proven to do well on only one metric, either accuracy or speed. Generally, more accurate detectors have been found to be more compute-demanding, which isn’t the ideal scenario, especially when…

2019 has been the year where a lot of research has focused on designing efficient deep learning models, self-supervised learning, learning with a limited amount of data, new pruning strategies, etc. Although self-training isn’t something new, this latest paper from the Google Brain team uses the approach to surpass the top-1 ImageNet accuracy of SOTA models by 1%, and shows that the model’s robustness improves as well.

What is self-training?

Self-training is one of the simplest semi-supervised methods. The main idea is to find a way to augment the labeled dataset with the unlabeled dataset, after all, getting labeled data…
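A common way to do that augmentation is confidence-thresholded pseudo-labeling: let the current model label the unlabeled pool and keep only the predictions it is confident about. The sketch below is a minimal illustration of that loop body; `toy_predict` and the 0.9 threshold are made up for the example, not taken from the paper:

```python
import numpy as np

def pseudo_label(model_predict, unlabeled_x, threshold=0.9):
    """Keep only the unlabeled examples the current model is confident about,
    returning them with the model's predictions as their new (pseudo) labels."""
    probs = model_predict(unlabeled_x)      # (n, n_classes) probabilities
    confidence = probs.max(axis=1)
    mask = confidence >= threshold
    return unlabeled_x[mask], probs[mask].argmax(axis=1)

# Toy stand-in for a trained binary classifier.
def toy_predict(x):
    p1 = x.ravel()
    return np.stack([1 - p1, p1], axis=1)

unlabeled = np.array([[0.99], [0.55], [0.02]])
x_sel, y_sel = pseudo_label(toy_predict, unlabeled, threshold=0.9)
print(x_sel.ravel(), y_sel)   # only the two confident examples survive
```

In a full self-training loop you would retrain on labeled + pseudo-labeled data and repeat, optionally with noise/augmentation added to the student as in the Noisy Student paper.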

In recent years, we have witnessed the remarkable achievements of CNNs. Iterative improvements on a task require bigger models and more computation. However, bigger models with huge memory footprints and computation requirements prevent deployment on mobile and edge devices. This paper aims to find a solution to this problem.

Why is deployment on mobile and edge devices hard?

Mobile and edge devices are resource-constrained. The constraints mainly come from three aspects:

  • Model size
  • Run-time memory
  • Number of compute operations (FLOPs)

To get an idea of “why”, let’s take VGG-16 as an example. The model has 138 million parameters, consumes more than 500MB and requires more…
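The 500MB figure follows directly from the parameter count: each float32 parameter takes 4 bytes, so a quick back-of-the-envelope check looks like this:

```python
# VGG-16 stores roughly 138 million float32 parameters, 4 bytes each.
params = 138_000_000
bytes_fp32 = params * 4
print(f"{bytes_fp32 / 1024**2:.0f} MB")   # ≈ 526 MB, consistent with ">500MB"
```

That is weights alone; run-time activation memory comes on top, which is why the list above treats model size and run-time memory as separate constraints.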

In late 2015, a team at Google came up with the paper “Rethinking the Inception Architecture for Computer Vision”, where they introduced a new technique for robust modelling. This technique was termed “Label Smoothing”. Since then, it has been used in many state-of-the-art models across image classification, language translation and speech recognition. Despite its widespread usage, label smoothing is poorly understood, and it is hard to answer why and when it works. …

Since AlexNet won the 2012 ImageNet competition, CNNs (short for Convolutional Neural Networks) have become the de facto algorithms for a wide variety of tasks in deep learning, especially in computer vision. From 2012 to date, researchers have been experimenting with better and better architectures to improve model accuracy on different tasks. Today, we will take a deep dive into the latest research paper, EfficientNet, which focuses not only on improving accuracy but also on the efficiency of models.

Why does scaling even matter?

Before discussing “What the heck scaling means?”, the relevant question is: Why does scaling matter at…
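For a preview of where this goes: EfficientNet scales depth, width, and input resolution jointly with a single compound coefficient φ, using constants α, β, γ found by grid search (α·β²·γ² ≈ 2 in the paper). The baseline depth/width/resolution numbers below are hypothetical, just to show the mechanics:

```python
# Compound-scaling constants from the EfficientNet paper's grid search.
alpha, beta, gamma = 1.2, 1.1, 1.15

def scale(phi, base_depth, base_width, base_resolution):
    """Scale depth, width, and resolution together with one coefficient phi,
    instead of tuning each dimension independently."""
    return (round(base_depth * alpha ** phi),
            round(base_width * beta ** phi),
            round(base_resolution * gamma ** phi))

print(scale(0, 18, 64, 224))   # phi = 0 leaves the baseline unchanged
print(scale(1, 18, 64, 224))   # one compound step scales all three at once
```

Doubling FLOPs thus grows all three dimensions in a fixed ratio rather than, say, only stacking more layers.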

Aakash Nain

Research Engineer, Machine Learning
