Akira’s ML news — Week 7, 2021

Akihiro FUJII
Published in Analytics Vidhya
Feb 12, 2021 · 8 min read

Here are some of the papers and articles I read in week 7 of 2021 (beginning 7 February) that I found particularly interesting. I’ve tried to cover the most recent work as much as possible, but a paper’s submission date may not fall within this week.

1. Machine Learning Papers

A Practical Approach to Network Sparsification

[2102.00554] Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
https://arxiv.org/abs/2102.00554

This 70+ page survey examines practical methods for network sparsification, discussing what works in practice among recent proposals. It finds that regularization- and schedule-based approaches can reach high performance but need more hyperparameters and are harder and more expensive to train; that dropout is useful as a regularizer applied before sparsification; and that the initialization of sparse networks needs to be considered carefully in each case.

Encrypt the data set by mixing the data.

[2010.02772] InstaHide: Instance-hiding Schemes for Private Distributed Learning
https://arxiv.org/abs/2010.02772

They propose InstaHide, which protects a dataset from attackers by mixing each private image with several other images drawn from both its own dataset and publicly available images such as ImageNet, and then randomly inverting the signs of the pixels of the mixture. The data can be protected this way without degrading accuracy after training.

Break InstaHide’s data protection

[2011.05315] An Attack on InstaHide: Is Private Learning Possible with Instance Encoding?
https://arxiv.org/abs/2011.05315

They present an attack that recovers the original data from data encrypted by InstaHide (https://arxiv.org/abs/2010.02772). InstaHide encrypts its own data by mixing it with publicly available data such as CIFAR-10 and with its own private images. The original data is recovered by constructing a function that measures similarity between encrypted images and then clustering the encrypted data with it.

Audio word2vec

[2006.11477] wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
https://arxiv.org/abs/2006.11477

A study of self-supervised learning of speech representations. Raw audio is first reduced to local latent representations with a CNN; parts of the latent sequence are then masked and, as in BERT, recovered against discretized representations obtained from a codebook. Fine-tuning with CTC outperforms existing semi-supervised methods.

A self-supervised learning method that can be trained with small batch sizes


[2101.07525] Momentum² Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning
https://arxiv.org/abs/2101.07525

Instead of computing BatchNorm statistics from a single large batch, they propose Momentum BatchNorm, which normalizes with statistics accumulated over the history of batches, and it can be applied directly to the BYOL framework. While BYOL uses a batch size of 4096, they achieve reasonable accuracy with a batch size of 32.

Eliminate NMS in object detection


[2101.11782] Object Detection Made Simpler by Eliminating Heuristic NMS
https://arxiv.org/abs/2101.11782

Research on eliminating NMS in object detection. They attach a PSS head that learns which anchor is optimal for each object and train it with a ranking loss. The method removes NMS, improves accuracy slightly, and does not slow down training.

Speed up the turbulence simulation by a factor of 80.


[2102.01010] Machine learning accelerated computational fluid dynamics
https://arxiv.org/abs/2102.01010

A study that simulates turbulence by approximating the update rule at each time step with ML. Accuracy is not degraded even at a 10× coarser resolution, and as a result the simulation can be accelerated by up to 80× while maintaining accuracy.

Improve accuracy by explicitly incorporating differentiable fluid simulations in the learning loop.

[2007.00016] Solver-in-the-Loop: Learning from Differentiable Physics to Interact with Iterative PDE-Solvers
https://arxiv.org/abs/2007.00016

For machine-learning approximations of fluid dynamics, this study explicitly incorporates a differentiable physics simulation into the training loop. Because the solver is differentiable, its gradients can also be used to reduce discretization error, a frequent problem in such approximations.

Multiple Vision-Language tasks with a single model


[2102.02779] Unifying Vision-and-Language Tasks via Text Generation
https://arxiv.org/abs/2102.02779

They propose VL-BART and VL-T5, which can perform a variety of vision-and-language tasks with a single model by casting every task as text generation with an autoregressive language model. The results are better than those of previous studies that require task-specific heads.

Use reinforcement learning to encourage the use of captions that do not exist in the training data.


[2101.09865] ECOL-R: Encouraging Copying in Novel Object Captioning with Reinforcement Learning
https://arxiv.org/abs/2101.09865

They propose ECOL-R (Encouraging Copying of Object Labels with Reinforcement Learning) for the Novel Object Captioning task, in which images must be described zero-shot using object labels that do not appear in the caption training data. The method uses reinforcement learning to encourage the model to copy such labels into its captions. It builds on a Visual Genome detector, which extracts object locations and labels from images, and a task-specific object detector, which extracts more abstract label information.

2. Technical Articles

Implementation of Vision Transformer in Keras (keras.io)

An implementation in Keras of the Vision Transformer, which last year outperformed CNN-based models using only pure transformers. The article explains not only the model but also the preprocessing.

NeurIPS 2020 Participation Report (feldberlin.com)

This article describes research the author found interesting while attending NeurIPS 2020, together with explanations. Topics include model compression, explainability, speech, and masked language models.

The History of Interpretability in Image Classification (thegradient.pub)

This article traces the evolution of interpretability research in deep learning, introducing methods such as Grad-CAM from oldest to newest and explaining how each differs from earlier methods, which makes it very easy to follow.

3. Machine Learning Use Cases

Proposing the shortest route taking into account the charging time of electric vehicles (ai.googleblog.com)

Most navigation systems derive the shortest route with textbook Dijkstra-based methods, but these cannot be applied directly to electric vehicles, which have fewer charging points and require long charging times. This article introduces the development of an algorithm for electric vehicles that uses Google’s graph neural network.

The Horror of Machine Learning Bias (gizmodo.com)

An interview with a machine learning researcher. While machine learning techniques such as facial recognition are used extensively around the world, they often contain inappropriate biases. The article warns that such bias can lead to the oppression of minorities and racial discrimination, and says that companies that deal with machine learning should focus more than ever on preventing such bias and ensuring safety.

4. Other Topics

A very high-performance pretrained ResNet50 from Microsoft (www.microsoft.com)

Microsoft has released a very high-performance pretrained ResNet50 model that was multi-task trained on multiple large datasets. It shows transfer-learning performance that surpasses Google’s Big Transfer and OpenAI’s CLIP. The trained models are available from the linked page.

About Me

Manufacturing Engineer/Machine Learning Engineer/Data Scientist / Master of Science in Physics / http://github.com/AkiraTOSEI/

On Twitter, I post one-sentence paper commentaries.
