Akira’s Machine Learning News — Issue #33

Akihiro FUJII

Published in

Analytics Vidhya

6 min read · Nov 10, 2021

Featured Paper/News in This Week.

A new dataset for self-supervised learning has been released that can be used commercially and avoids portrait rights issues. As someone working in industry, I am very grateful for such a dataset, because large-scale datasets often used in academia, such as ImageNet, usually cannot be used for commercial purposes!
It also seems that higher pre-training accuracy does not necessarily lead to higher accuracy on the downstream task. The scaling strategy for downstream tasks therefore appears to need separate consideration from the scaling strategy for pre-training.

— — — — — — — — — — — — — — — — — — –

In the following sections, I will introduce various articles and papers, covering the above and more, under the following five topics.

Featured Paper/News in This Week
Machine Learning Use Case
Papers
Articles related to machine learning technology
Other Topics

— — — — — — — — — — — — — — — — — — –

1. Featured Paper/News in This Week

Pre-training performance does not necessarily match the performance of the downstream task. — arxiv.org

[2109.10686] Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
This study examines the relationship between model scale and downstream task accuracy. Pre-training performance improves as the model gets larger, but this does not necessarily carry over to the downstream task. The authors propose the DeepNarrow strategy, which makes the model narrower and deeper, and achieve 40% faster training while maintaining downstream task performance.
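To get a feel for why trading width for depth can be parameter-efficient, here is a rough sketch. It assumes the common approximation of about 12·d² weights per Transformer layer (4·d² for the attention projections plus 8·d² for a feed-forward block with hidden size 4·d) and ignores embeddings, biases, and layer norms. The function names and model sizes are made up for illustration and are not the configurations used in the paper.

```python
def approx_layer_params(d_model: int, ff_mult: int = 4) -> int:
    """Rough weight count for one Transformer layer (no biases, no norms)."""
    attention = 4 * d_model * d_model                 # Q, K, V and output projections
    feed_forward = 2 * ff_mult * d_model * d_model    # d -> ff_mult*d -> d
    return attention + feed_forward


def approx_model_params(num_layers: int, d_model: int) -> int:
    """Rough weight count for a stack of identical layers (embeddings ignored)."""
    return num_layers * approx_layer_params(d_model)


# Hypothetical sizes, chosen only to illustrate the trade-off.
base = approx_model_params(num_layers=12, d_model=768)
deeper = approx_model_params(num_layers=24, d_model=768)    # 2x depth
wider = approx_model_params(num_layers=12, d_model=1536)    # 2x width

print(deeper / base)  # 2.0: parameters grow linearly with depth
print(wider / base)   # 4.0: parameters grow quadratically with width
```

Under this approximation, doubling depth costs half as many extra parameters as doubling width. Whether the extra depth pays off on a downstream task is an empirical question that the paper studies.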

Dataset of 1.4 million images that avoids problems such as copyrights and portrait rights — arxiv.org

[2109.13228] PASS: An ImageNet replacement for self-supervised pretraining without humans
Huge datasets such as ImageNet have problems with licensing and with using photos of people without their consent. To address this, the authors collected images available under the CC-BY license and released PASS, a dataset for self-supervised learning that contains no people. They confirmed that models can be pre-trained on it with MoCo, DINO, etc., while avoiding problems such as copyright.

— — — — — — — — — — — — — — — — — — –

2. Machine Learning use case

Sustainable AI Systems

The Imperative for Sustainable AI Systems

This piece was the winner of the inaugural Gradient Prize. Introduction AI systems are compute-intensive: the AI…

thegradient.pub

An article discussing how to achieve sustainable AI systems. Although the amount of computation is currently increasing, the article suggests using smaller models, decentralizing where computation is done (taking carbon-emitting regions into account), and optimizing the energy use of both software and hardware.

— — — — — — — — — — — — — — — — — — –

3. Machine Learning Papers

Computationally efficient anomaly detection method using pre-trained models — arxiv.org

[2106.08265] Towards Total Recall in Industrial Anomaly Detection
Proposes PatchCore, which uses pre-trained models for anomaly detection. It is characterized by a coreset that aggregates the patch-level feature information of the training samples. It achieves SotA performance on the MVTec dataset.
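The coreset idea can be sketched in a few lines. The toy example below uses greedy farthest-point (k-center) selection to keep a small set of representative patch features, then scores a test patch by its distance to the nearest stored feature. It is a simplified illustration in plain Python rather than the authors' implementation: the 2-D points stand in for features that would really come from a pre-trained network, and the function names are hypothetical.

```python
import math


def greedy_coreset(features, k):
    """Pick k indices by farthest-point (k-center greedy) selection."""
    selected = [0]  # start from the first feature; a random start is also common
    # Distance from every feature to its closest selected feature so far.
    min_dist = [math.dist(f, features[0]) for f in features]
    while len(selected) < k:
        # Add the feature that is currently least well covered.
        idx = max(range(len(features)), key=lambda i: min_dist[i])
        selected.append(idx)
        min_dist = [min(d, math.dist(f, features[idx]))
                    for d, f in zip(min_dist, features)]
    return selected


def anomaly_score(patch, memory_bank):
    """Distance from a test patch feature to its nearest memory-bank entry."""
    return min(math.dist(patch, m) for m in memory_bank)


# Toy 2-D "patch features" from normal samples: two tight groups.
normal = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
memory = [normal[i] for i in greedy_coreset(normal, k=2)]

print(memory)                                # one representative per group
print(anomaly_score((0.05, 0.05), memory))   # small: close to normal features
print(anomaly_score((2.5, 9.0), memory))     # large: far from anything seen
```

Keeping only a coreset shrinks the memory that has to be searched at test time, which is where the computational efficiency comes from.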

Contrastive learning with text and video — arxiv.org

[2109.14084] VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Proposes VideoCLIP, which performs contrastive learning on text and video. The video clip is sampled at varying positions around the time of the sampled text, and clustering is used to supply hard samples for contrastive learning. In zero-shot inference on downstream tasks, the proposed method in some cases outperforms supervised approaches.
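As a reminder of what contrastive learning between two modalities computes, here is a minimal sketch of a symmetric InfoNCE-style loss in plain Python. It assumes each video embedding at index i is paired with the text embedding at the same index, and that the other items in the batch act as negatives. The temperature value and the toy embeddings are arbitrary, and the overlapping clip sampling and clustering-based hard negatives described above are not modeled.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy where anchors[i] should match candidates[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, c) / temperature for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def symmetric_contrastive_loss(video_emb, text_emb, temperature=0.1):
    """Video-to-text loss plus text-to-video loss."""
    return (info_nce(video_emb, text_emb, temperature)
            + info_nce(text_emb, video_emb, temperature))


# Toy unit-length embeddings for a batch of two video-text pairs.
video = [(1.0, 0.0), (0.0, 1.0)]
text_aligned = [(1.0, 0.0), (0.0, 1.0)]
text_shuffled = [(0.0, 1.0), (1.0, 0.0)]

loss_aligned = symmetric_contrastive_loss(video, text_aligned)
loss_shuffled = symmetric_contrastive_loss(video, text_shuffled)
print(loss_aligned < loss_shuffled)  # True: matching pairs give a lower loss
```

Harder negatives make the non-matching logits larger, so the model has to separate pairs that are genuinely similar rather than trivially different.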

3D object detection method without parameters to be adjusted manually — arxiv.org

[2109.08141] An End-to-End Transformer Model for 3D Object Detection
The authors propose 3DETR, a 3D object detection method that can be trained end-to-end. Like DETR, 3DETR treats object detection on point clouds as a set-to-set problem, but unlike DETR, it uses only Transformers and eliminates parameters that need to be adjusted manually.

Self-supervised learning for medical images — arxiv.org

[2101.05224] Big Self-Supervised Models Advance Medical Image Classification
This study shows that self-supervised learning on ImageNet, followed by further self-supervised learning on medical images, improves performance on the subsequent classification task. Since medical images of the same case are often taken from multiple angles, the authors propose Multi-Instance Contrastive Learning, which treats such images as views of the same data.

Fine-tuning CLIP by adding small networks and residual connections — arxiv.org

[2110.04544] CLIP-Adapter: Better Vision-Language Models with Feature Adapters
The authors propose CLIP-Adapter, which fine-tunes CLIP with less data. It adds a small network after the final layer of both the image and language branches and fine-tunes only that network. Another feature of the structure is that a residual connection makes it easy to retain the information of the original final layer. Good performance can be achieved even with little data.
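The structure described above can be sketched as a small bottleneck network whose output is blended with the original feature. The plain-Python example below is an illustration under assumptions: the layer sizes, the weights, and the blending ratio `alpha` are invented for the example, and the exact placement of activations in the paper's adapter may differ.

```python
def linear(x, weight):
    """Multiply a vector x (length n) by a weight matrix (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, weight))
            for j in range(len(weight[0]))]


def relu(x):
    return [max(0.0, v) for v in x]


def adapt(feature, w_down, w_up, alpha=0.2):
    """Blend a bottleneck-adapted feature with the original (frozen) feature."""
    adapted = linear(relu(linear(feature, w_down)), w_up)
    # Residual blend: alpha controls how much new knowledge is mixed in.
    return [alpha * a + (1.0 - alpha) * f for a, f in zip(adapted, feature)]


# Hypothetical 4-D feature from a frozen encoder and a 4 -> 2 -> 4 adapter.
feature = [1.0, -2.0, 0.5, 3.0]
w_down = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
w_up = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

print(adapt(feature, w_down, w_up, alpha=0.0))  # [1.0, -2.0, 0.5, 3.0]: original feature kept
print(adapt(feature, w_down, w_up, alpha=0.5))  # [1.0, -1.0, 0.25, 1.5]: half adapted, half original
```

With `alpha` at 0 the adapter is a no-op, which shows why the residual path helps preserve the pre-trained representation when only a few training examples are available.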

Combining Transformer and CNN to build a network that runs at high speed — arxiv.org

[2110.02178] MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
The authors propose MobileViT, a fast network for mobile devices that combines a Transformer and a CNN. Local information is first captured by the CNN, and global information is then processed by the Transformer. On MS-COCO object detection, it is 5.7% more accurate than MobileNetv3. It can be used for classification, object detection, and segmentation.

— — — — — — — — — — — — — — — — — — –

4. Technical Articles

PyTorch implementations of famous algorithms — nn.labml.ai

labml.ai Annotated PyTorch Paper Implementations

This is a collection of simple PyTorch implementations of neural networks and related algorithms. These implementations…

nn.labml.ai

This website presents PyTorch implementations of the core techniques of many papers, ranging from newer models such as gMLP to GANs and reinforcement learning. If you are interested in a particular technique, it is worth checking here.

— — — — — — — — — — — — — — — — — — –

5. Other Topics

20 AI People to Watch — aijourn.com

20 AI Influencers You NEED To Be Following - The AI Journal

"Artificial intelligence will reach human levels by around 2029. Follow that out further to, say, 2045, we will have…

aijourn.com

This article introduces 20 influential people in the AI field, along with short descriptions and their Twitter and LinkedIn accounts.

— — — — — — — — — — — — — — — — — — –

Other Blogs

Machine Learning 2020 summary: 84 interesting papers/articles

In this article, I present a total of 84 papers and articles published in 2020 that I found particularly interesting…

towardsdatascience.com

Recent Developments and Views on Computer Vision x Transformer

On the differences between Transformer and CNN, why Transformer matters, and what its weaknesses are.

towardsdatascience.com

Reach and Limits of the Supermassive Model GPT-3

In this blog post, I will give a technical explanation of GPT-3, what GPT-3 has achieved, and what GPT-3 could not…

medium.com

Do Vision Transformers See Like Convolutional Neural Networks? (Paper Explained)

I will take a closer look at the differences in the obtained representations between CNN and Transformers

towardsdatascience.com

— — — — — — — — — — — — — — — — — — –

About Me

Manufacturing Engineer/Machine Learning Engineer/Data Scientist / Master of Science in Physics / http://github.com/AkiraTOSEI/

On Twitter, I post one-sentence paper commentary.

Written by Akihiro FUJII