Akira’s ML News #Week44, 2020

Akihiro FUJII
Published in Analytics Vidhya
6 min read · Oct 31, 2020

Here are some of the papers and articles that I found particularly interesting in week 44 of 2020 (25 October~). I've tried to cover the most recent work as much as possible, but a paper's submission date may not fall within this week.

Topics

  1. Machine Learning Papers
  2. Technical Articles
  3. Examples of Machine Learning use cases
  4. Other topics

— Weekly Editor’s pickup

— Past Articles

Week 43 ⇦ Week 44 (this post) ⇨ Week 45

September 2020 summary

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

1. Machine Learning Papers

— —

Multilingual text-to-text massive model

mT5: A massively multilingual pre-trained text-to-text transformer
https://arxiv.org/abs/2010.11934

They propose mT5, a multilingual version of T5 that unifies all tasks into a text-to-text format and follows the usual pre-train-then-fine-tune strategy, together with mC4, a large multilingual dataset covering 101 languages. With up to 13 billion parameters, it achieves the highest performance on a variety of tasks.

Figure and caption taken from the above paper
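As a concrete illustration of the text-to-text idea, here is a minimal sketch of how different tasks collapse into one string-to-string format. The task prefixes and field names are illustrative, not mT5's exact conventions:

```python
# Sketch of the text-to-text format T5-style models use: every task,
# including classification, becomes a mapping from an input string to an
# output string. Prefixes below are illustrative, not mT5's exact ones.

def to_text2text(task, **fields):
    """Convert one task instance into an (input_text, target_text) pair."""
    if task == "sentiment":
        return (f"sentiment: {fields['sentence']}", fields["label"])
    if task == "translation":
        return (f"translate English to German: {fields['source']}", fields["target"])
    if task == "qa":
        return (f"question: {fields['question']} context: {fields['context']}",
                fields["answer"])
    raise ValueError(f"unknown task: {task}")

print(to_text2text("sentiment", sentence="great movie!", label="positive"))
# → ('sentiment: great movie!', 'positive')
```

Because every task shares this format, a single encoder–decoder model and a single training loop can serve classification, translation, and QA alike.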

Regularization using the difference in learning speed between wrong and correct labels

Early-Learning Regularization Prevents Memorization of Noisy Labels
https://arxiv.org/abs/2007.00151

They found that, under label noise, the model learns correctly labeled data first; for mislabeled data it initially predicts the correct labels, but is later pulled toward the wrong labels and memorizes them. Exploiting this phenomenon, they propose ELR (Early-Learning Regularization), a regularization method based on a moving average of the model's outputs. It is very effective in the presence of label noise.

Figure and caption taken from the above paper
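The core of the method can be sketched in a few lines of NumPy. The update rule follows the paper's description of a running average of past predictions, but the hyperparameter values and variable names here are illustrative:

```python
import numpy as np

def elr_loss(probs, target_ma, labels, beta=0.7, lam=3.0):
    """Sketch of Early-Learning Regularization (ELR).

    probs:     (N, C) current softmax outputs for a batch
    target_ma: (N, C) running average of past outputs (updated in place)
    labels:    (N,)   observed, possibly noisy, integer labels
    beta, lam: illustrative hyperparameter values
    """
    # Temporal moving average of the model's own predictions; early in
    # training these targets reflect the (correct) early-learning phase.
    target_ma *= beta
    target_ma += (1.0 - beta) * probs
    # Standard cross-entropy on the observed labels.
    ce = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
    # Regularizer: minimizing log(1 - <p, t>) pushes the inner product toward 1,
    # anchoring predictions to the moving-average targets instead of the noise.
    reg = np.log(1.0 - (target_ma * probs).sum(axis=1) + 1e-12).mean()
    return ce + lam * reg
```

The regularizer rewards agreement with the model's own early predictions, which counteracts the later drift toward memorizing wrong labels.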

Self-supervised learning using causal graphs

REPRESENTATION LEARNING VIA INVARIANT CAUSAL MECHANISMS
https://arxiv.org/abs/2010.07922

Viewing an image as generated by a causal graph of content (e.g., animal species) and style (e.g., background), they propose ReLIC, a self-supervised learning method that learns representations invariant to style. Concretely, the model classifies individual images while matching its prediction distributions across the style transformations produced by data augmentation. It is not only comparable to previous work but also effective in reinforcement learning.

Figure and caption taken from the above paper
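The style-invariance constraint amounts to matching the model's output distributions across two augmented views of the same image. A minimal sketch of such a penalty, written here as a symmetrized KL term (one way to express the invariance constraint, not the paper's exact loss):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Row-wise KL divergence between two probability matrices."""
    return (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1)

def invariance_penalty(probs_view1, probs_view2):
    """Symmetrized KL between the model's output distributions for two style
    augmentations of the same images; driving this toward zero enforces the
    style invariance that ReLIC imposes (sketch only)."""
    return (kl(probs_view1, probs_view2) + kl(probs_view2, probs_view1)).mean()
```

When the two views yield identical distributions the penalty is zero, so minimizing it forces predictions to ignore style changes introduced by augmentation.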

Data is not just about quantity, but quality as well.

Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
https://arxiv.org/abs/2009.10795

A study investigating the quality of NLP datasets. It shows that training data can be divided into easy-to-learn regions that drive convergence, hard-to-learn regions that are difficult largely because of mislabeling, and ambiguous regions with fluctuating confidence that contribute most to generalization performance. NLP has recently tended to emphasize quantity over quality, but this suggests it is worth reviewing quality as well.

Figure and caption taken from the above paper
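The paper's "data map" places each training example on two axes computed from training dynamics. Assuming we have recorded the gold-label probability at each epoch, the statistics are simple to compute:

```python
import numpy as np

def data_map_stats(gold_probs):
    """gold_probs: (epochs, N) array, the model's probability for each
    example's gold label at each training epoch.

    Returns per-example confidence (mean over epochs) and variability
    (std over epochs) -- the two axes of the paper's data map:
      high confidence, low variability -> easy-to-learn
      low confidence, low variability  -> hard-to-learn (often mislabeled)
      high variability                 -> ambiguous (aids generalization)
    """
    return gold_probs.mean(axis=0), gold_probs.std(axis=0)
```

Plotting variability against confidence for all examples reproduces the cartography plot from the paper's figure.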

Positional Dependence of Accuracy by Padding

MIND THE PAD — CNNS CAN DEVELOP BLIND SPOTS
https://arxiv.org/abs/2010.02178

The study shows that padding introduces a positional dependence in accuracy: in networks like ResNet that downsample with stride-2 convolutions, padding pixels are not used equally, depending on the image size (e.g., the leftmost padding column is covered by a kernel window while the rightmost is not). Simply changing the image size so that the padding is treated symmetrically improves accuracy.

Figure and caption taken from the above paper
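The asymmetry is easy to verify arithmetically. For a 3×3 kernel with stride 2 and padding 1 (typical ResNet downsampling), whether the rightmost padding column is ever covered by a kernel window depends on the input width:

```python
def right_pad_used(width, kernel=3, stride=2, pad=1):
    """Check whether the rightmost padding column of a strided convolution is
    ever covered by a kernel window (illustrating the paper's observation)."""
    out = (width + 2 * pad - kernel) // stride + 1
    # Rightmost input index covered by the last window; padded coordinates
    # start at -pad, and index `width` is the first right-padding column.
    last_end = -pad + stride * (out - 1) + kernel - 1
    return last_end >= width

for w in (8, 9):
    print(w, right_pad_used(w))
# → 8 False   (left padding used, right padding never seen)
# → 9 True    (both sides used)
```

Even widths leave the rightmost (and, analogously, the bottom) padding unused, which is exactly the asymmetric padding treatment the paper links to blind spots; changing the input size by one pixel restores symmetry.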

Embedding Representations in High-Dimensional Space

Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
https://arxiv.org/abs/2003.01629

Representation learning usually aims at low-dimensional representations, but since the state representation in reinforcement learning behaves like an intermediate layer, this study, inspired by the success of large-scale networks, instead maps it into a higher-dimensional space. The approach is effective in many environments.

Figure and caption taken from the above paper
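The idea of deliberately raising the dimensionality of the state can be sketched as a DenseNet-style expansion, where the raw observation is concatenated with each successive layer's features. The weights and sizes below are illustrative:

```python
import numpy as np

def expand_state(state, layer_weights):
    """Map a raw state to a higher-dimensional representation by concatenating
    it with every layer's features (DenseNet-style growth). Sketch only; the
    paper trains such a representation with an auxiliary prediction task."""
    feats = [state]
    for W in layer_weights:
        feats.append(np.tanh(np.concatenate(feats, axis=-1) @ W))
    return np.concatenate(feats, axis=-1)

rng = np.random.default_rng(0)
state = rng.normal(size=(2, 4))                  # batch of 4-dim observations
weights = [rng.normal(size=(4, 8)), rng.normal(size=(12, 8))]
print(expand_state(state, weights).shape)        # (2, 20): 4 + 8 + 8
```

The RL agent then consumes this 20-dimensional representation instead of the raw 4-dimensional observation.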

Visualization through segmentation of Active Learning

Deep Active Learning for Joint Classification & Segmentation with Weak Annotator
https://arxiv.org/abs/2010.04889

Starting from labeled data with only a few segmentation masks, they propose a method that performs classification and segmentation jointly while gradually increasing the number of masked examples through active learning, yielding better visualizations than CAM.

Figure and caption taken from the above paper
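The acquisition step of such an active-learning loop can be sketched generically: pick the unlabeled images the model is least certain about and send them to the annotator for masks. Predictive entropy below is a standard criterion, not necessarily the exact one used in the paper:

```python
import numpy as np

def select_for_annotation(probs, k):
    """Return indices of the k samples with the highest predictive entropy,
    i.e. the ones the current model is least certain about."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(entropy)[::-1][:k]

probs = np.array([[0.99, 0.01],   # confident
                  [0.50, 0.50],   # maximally uncertain
                  [0.80, 0.20]])
print(select_for_annotation(probs, 1))  # → [1]
```

Alternating this selection step with retraining grows the masked subset where annotation effort pays off most.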

Scaling Laws of various data domains

Scaling Laws for Autoregressive Generative Modeling
https://arxiv.org/abs/2010.14701

The study investigates scaling laws for compute, data volume, and model size across various data domains. They find power laws in all the domains studied, and the optimal model size for a given compute budget follows a universal trend regardless of domain.

Figure and caption taken from the above paper
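A power law L(x) = a·x^(−b) appears as a straight line in log-log space, so the exponent can be recovered with a linear fit. A self-contained sketch with synthetic data (the exponent 0.076 here is only an example value):

```python
import numpy as np

# Synthetic losses obeying a power law L(x) = a * x**(-b), where x could be
# compute, dataset size, or parameter count.
x = np.array([1e6, 1e7, 1e8, 1e9])
L = 5.0 * x ** -0.076

# In log-log space: log L = log a - b * log x, a straight line.
slope, intercept = np.polyfit(np.log(x), np.log(L), 1)
b, a = -slope, np.exp(intercept)
print(b, a)  # recovers b ≈ 0.076, a ≈ 5.0
```

Fitting the same form to measured losses in each domain is how the paper extracts the exponents it compares across modalities.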

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

2. Technical Articles

— — — —

Effectiveness of synthetic data

An interview-style article discussing how synthetic data was used in the field. It covers how synthetic data can add diversity to a dataset, and how the team iterated on the synthetic data until it reached satisfactory quality while collecting real data in parallel.

ViT Explanation

This video explains the ViT paper, a Transformer model that beats CNN models. It covers how, like a CNN, the Transformer builds up global features from local features as the layers get deeper, and how the Transformer's weaker inductive bias compared with CNNs and LSTMs means that, given a large enough dataset, it can surpass them.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

3. Examples of Machine Learning use cases

— — — —

Using machine learning to help clean up river litter

Over two Microsoft global hackathons, in 2018 and 2019, a hackathon team worked with The Ocean Cleanup, known worldwide for its innovative efforts to rid the ocean of plastics, to build a machine learning model that helps quantify the amount of plastic pollution flowing down rivers en route to the ocean.

Using AI to analyze information warfare

The article discusses how natural language processing was used to analyze the large volume of news and information disseminated during information warfare. Among other findings, it shows that for months before the recent conflict between Armenia and Azerbaijan, information had been spread to deliberately portray one of the countries as the aggressor.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

4. Other Topics

— — — —

Interview with Competitions Grandmaster

A video interview with a Kaggle Competitions Grandmaster. It is about 5 minutes long and covers how he got into Kaggle, what being a Grandmaster means to him, and advice for Kaggle beginners.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

— Past Articles

Week 43 ⇦ Week 44 (this post) ⇨ Week 45

September 2020 summary

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

On Twitter, I post one-sentence commentaries on papers.

https://twitter.com/AkiraTOSEI
