Want better accuracy for classification models? Let’s use “Attention maps”!

Ankit Kumar
Aug 11, 2019 · 2 min read

A major challenge for research groups and companies working on image classification in computer vision is the case where images belonging to different classes look very similar. Standard convolutional neural network classifiers tend to perform poorly on such datasets.

To address this common yet crucial issue in supervised learning, MoonVision proposes few-shot classification with an attention mechanism, which helps the convolutional neural network learn the fine-grained local details of an image. Because attention maps focus on these details, they help the network distinguish between images that look very similar but belong to different categories. This can increase the accuracy of deep learning architectures.
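To make the idea concrete, here is a minimal sketch of spatial attention pooling over a feature map, written in plain Python. It is an illustration under stated assumptions, not MoonVision's implementation: the function names, the `query` vector (which would be learned in a real network), and the toy feature map are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(feature_map, query):
    """Pool an H x W x C feature map into a single C-dimensional descriptor.

    feature_map[i][j] is the C-dimensional local feature at location (i, j).
    query is a hypothetical C-dimensional vector that scores how
    discriminative each location is (assumed to be learned in practice).
    Returns (attention_map, descriptor); attention_map is H x W and sums to 1.
    """
    height, width = len(feature_map), len(feature_map[0])
    locations = [feature_map[i][j] for i in range(height) for j in range(width)]
    # Score each location by its dot product with the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in locations]
    weights = softmax(scores)
    # Weighted sum of the local features, channel by channel.
    descriptor = [
        sum(w * feat[c] for w, feat in zip(weights, locations))
        for c in range(len(query))
    ]
    attention_map = [weights[i * width:(i + 1) * width] for i in range(height)]
    return attention_map, descriptor


# Toy 2 x 2 feature map with 2 channels: channel 1 fires only bottom-right.
toy_map = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [1.0, 5.0]],
]
attention_map, descriptor = attention_pool(toy_map, query=[0.0, 1.0])
```

In this toy case almost all of the attention weight lands on the bottom-right cell, so the pooled descriptor is dominated by that one local feature rather than by the three uninformative cells.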

Qualitative results on the Cars196 dataset using our few-shot classification with an attention mechanism.

[Figure: two images]

Let’s see the magic of “Attention maps” for better interpretability

[Figure: two images]

The images above show where our model focuses its attention during training on the CUB200 and Cars196 datasets, in contrast to other methods that only visualize the activations of different layers of the model.
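For readers who want to produce a similar visualization, the sketch below shows one common way to turn a coarse attention map into an image-sized heatmap: min-max normalize it, then upsample it with nearest-neighbor interpolation. This is an assumed recipe for illustration, not the procedure used for the figures above; the function names and the 2 x 2 example map are hypothetical.

```python
def normalize(attention_map):
    """Min-max scale an attention map so its values lie in [0, 1]."""
    flat = [v for row in attention_map for v in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        # A constant map carries no spatial information; return all zeros.
        return [[0.0 for _ in row] for row in attention_map]
    return [[(v - low) / span for v in row] for row in attention_map]


def upsample_nearest(attention_map, out_height, out_width):
    """Resize a coarse map to out_height x out_width by nearest neighbor."""
    in_height, in_width = len(attention_map), len(attention_map[0])
    return [
        [
            attention_map[y * in_height // out_height][x * in_width // out_width]
            for x in range(out_width)
        ]
        for y in range(out_height)
    ]


# Hypothetical 2 x 2 attention map from the last convolutional stage.
coarse = [[0.1, 0.2], [0.3, 0.9]]
heatmap = upsample_nearest(normalize(coarse), 4, 4)
```

The resulting `heatmap` has the same height and width as the (toy) input image, so it can be blended with the image pixel by pixel in whatever plotting tool you prefer.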

Indeed, local features play a critical role in many fine-grained visual recognition tasks. Typical deep neural networks designed for image classification are good at extracting high-level global features, but they often miss the features of local details. Without those local features, attention maps have little to work with when telling apart images from different classes. For example, when identifying different cars, a model that lacks local details cannot be pushed to focus on the most discriminative parts, such as logos and lights.
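A small numeric example may help show why global features alone can lose local details. The sketch below compares plain global averaging with a softmax-weighted average on a toy 4 x 4 "logo detector" response map. Everything here is an assumption made for illustration: the response values, the `temperature` parameter, and the simplification of using the response itself as the attention score.

```python
import math


def global_average(response):
    """Plain global average pooling over an H x W response map."""
    flat = [v for row in response for v in row]
    return sum(flat) / len(flat)


def attention_weighted(response, temperature=0.25):
    """Softmax-weighted average; a lower temperature gives a sharper focus.

    Simplifying assumption: the response itself is used as the attention score.
    """
    flat = [v for row in response for v in row]
    peak = max(flat)
    exps = [math.exp((v - peak) / temperature) for v in flat]
    total = sum(exps)
    return sum(v * e / total for v, e in zip(flat, exps))


# Two hypothetical cars that differ only at one location (the logo).
car_a = [[0.0] * 4 for _ in range(4)]
car_a[1][2] = 1.0
car_b = [[0.0] * 4 for _ in range(4)]

gap_difference = abs(global_average(car_a) - global_average(car_b))
attention_difference = abs(attention_weighted(car_a) - attention_weighted(car_b))
```

With global averaging, the single distinguishing cell is diluted across all 16 locations, so the two cars end up with nearly identical pooled values. The attention-weighted version keeps most of that local signal, which is the intuition behind using attention for fine-grained classes.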

Conclusion

Our few-shot classification with an attention mechanism is a state-of-the-art image classifier that extracts both high-level global features and low-level local features to get better results.

Check out what we do at https://www.moonvision.io/ and check our platform at https://app.moonvision.io/signup.

Written by Ankit Kumar, data science research intern at MoonVision

Moonvision

MoonVision automates visual inspection tasks with proprietary computer vision and deep learning tools bundled in the Moonvision Toolbox. We focus on edge deployment, active learning and training data management and enable domain experts to work with next generation technology.
