Convolutions to Transformers for Deep Metric Learning

Can Visual Transformers replace Convolutional Neural Networks in Deep Metric Learning?

Like you, I have read about the amazing success of Transformers. I got hyped by it and wanted to understand them and use them in my job as well. Most of the computer vision problems I solve at work are related to few-shot learning, so I wanted to test Transformers in this area specifically. Who doesn’t like to show a nice attention map to a client and impress them…



