New Study Suggests Self-Attention Layers Could Replace Convolutional Layers on Vision Tasks

Nowhere has AI experienced greater development or breakthroughs in recent years than in the field of natural language processing (NLP) — and “transformers” are the not-so-secret new technology behind this revolution. The key difference between transformers and traditional methods such as recurrent neural networks or convolutional neural networks is that transformers can simultaneously attend to every word of an input text. Transformers’ impressive performance across a wide range of NLP tasks is enabled by a novel attention mechanism which captures meaningful inter-dependencies between words in a sequence by calculating both positional and content-based attention scores.
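To make the idea of attention scores more concrete, here is a minimal single-head self-attention sketch in NumPy. It is an illustration only, not code from any of the papers mentioned: the function names, the additive positional embeddings and the scaled dot-product form are assumptions chosen for simplicity. Because positions are added to the token embeddings before the query and key projections, the resulting scores mix content-based and positional terms.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

def self_attention(tokens, positions, w_query, w_key, w_value):
    """Single-head self-attention over a sequence (hypothetical helper).

    tokens, positions: (T, D) content and positional embeddings.
    w_query, w_key: (D, D_k) projections; w_value: (D, D_v) projection.
    Returns the (T, D_v) outputs and the (T, T) attention weights.
    """
    # Assumption: positions are simply added to the content embeddings.
    inputs = tokens + positions
    queries = inputs @ w_query
    keys = inputs @ w_key
    # Every token scores every other token at once.
    scores = queries @ keys.T / np.sqrt(w_key.shape[1])
    weights = softmax(scores)
    return weights @ (inputs @ w_value), weights
```

Each row of the returned weight matrix is a probability distribution over all positions in the sequence, which is what lets every token attend to every other token in a single step.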

Inspired by the performance of attention mechanisms in NLP, researchers have explored applying them to vision tasks. Google Brain Team researcher Prajit Ramachandran proposed that self-attention layers could completely replace convolutional layers on vision tasks while still achieving state-of-the-art performance. To test this idea, researchers from École Polytechnique Fédérale de Lausanne (EPFL) put forth theoretical and empirical evidence indicating that self-attention layers can indeed achieve the same performance as convolutional layers.

From a theoretical perspective, the researchers used a constructive proof to show that a multi-head self-attention layer can simulate any convolutional layer.
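The intuition behind such a construction can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the researchers' code: it uses one attention head per kernel position, and each head's attention is hard-coded as a one-hot choice of a fixed relative pixel shift rather than derived from a softmax over positional scores. The function names are hypothetical. Under those assumptions, summing the heads after a per-head linear projection reproduces a zero-padded convolution.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Reference K x K convolution (cross-correlation form) with zero padding.

    image: (H, W, C_in); kernel: (K, K, C_in, C_out) with odd K.
    """
    H, W, _ = image.shape
    K = kernel.shape[0]
    r = K // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)))
    out = np.zeros((H, W, kernel.shape[3]))
    for dy in range(K):
        for dx in range(K):
            # Pixels shifted by (dy - r, dx - r), projected by that kernel slice.
            out += padded[dy:dy + H, dx:dx + W, :] @ kernel[dy, dx]
    return out

def attention_as_conv(image, kernel):
    """Same result from K*K heads, each attending to one fixed relative shift."""
    H, W, C_in = image.shape
    K = kernel.shape[0]
    r = K // 2
    n = H * W
    tokens = image.reshape(n, C_in)           # one token per pixel
    ys, xs = np.divmod(np.arange(n), W)       # row and column of each token
    out = np.zeros((n, kernel.shape[3]))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # Assumption: hard one-hot attention; query pixel q attends only
            # to the pixel at q + (dy, dx). Out-of-image targets get no
            # weight, which plays the role of zero padding.
            ty, tx = ys + dy, xs + dx
            valid = (ty >= 0) & (ty < H) & (tx >= 0) & (tx < W)
            attn = np.zeros((n, n))
            attn[np.arange(n)[valid], (ty * W + tx)[valid]] = 1.0
            head = attn @ tokens               # values gathered by this head
            out += head @ kernel[dy + r, dx + r]  # per-head output projection
    return out.reshape(H, W, -1)
```

On a small random image and kernel, `np.allclose(conv2d_same(image, kernel), attention_as_conv(image, kernel))` should hold. The dense attention matrices are kept for clarity, not efficiency.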

The proof works by setting the parameters of a multi-head self-attention layer so that it acts like a convolutional layer. To validate the applicability of this theoretical construction, the researchers conducted a series of experiments comparing a fully attentional model comprising six multi-head self-attention layers with a standard ResNet18 on the CIFAR-10 dataset.

Test accuracy on CIFAR-10

In the tests, the self-attention models performed reasonably well, except for the variant using learned embeddings with content-based attention, which is mainly due to the increased number of parameters. With both theoretical and empirical support, the researchers concluded that any convolutional layer can be expressed by a self-attention layer, and that fully attentional models can learn to combine local behavior and global attention based on input content.

The paper On the Relationship between Self-Attention and Convolutional Layers is on arXiv.

Author: Hecate He | Editor: Michael Sarazen
