Vision Transformers (ViT) for Self-Supervised Representation Learning (Part 1)

Ching (Chingis) · Published in Deem.blogs · Apr 11, 2022 · 6 min read

Here I summarize recent work on Vision Transformers (ViT) in Self-Supervised Learning (SSL) and Unsupervised Learning to keep you updated. Vision Transformers have become very popular and are now used in many areas, including object detection, segmentation, and representation learning, so it is worth knowing what has been happening recently. I am summarizing some of the works I found online; however, I cannot include every detail, such as the full experiments, because I am covering multiple related works together. I hope you find it useful.

I also mention other SSL methods, such as MoCo and BYOL. If you are not familiar with them, see my previous article on Self-Supervised Learning below; please take your time with it, since it is closely related to what is covered here.

Related Article

Emerging Properties in Self-Supervised Vision Transformers

DINO (self-distillation with no labels) is proposed by Caron et al.
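At its core, DINO trains a student network to match the output of a momentum (EMA) teacher across different augmented views of the same image, with centering and temperature sharpening of the teacher output to avoid collapse. Below is a minimal PyTorch sketch of that idea, not the authors' official implementation: the module names, hyperparameters (temperatures, momenta, output dimension), and the toy MLP backbones standing in for ViT-plus-projection-head networks are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class DINOLoss(nn.Module):
    """Cross-entropy between the (centered, sharpened) teacher distribution
    and the student distribution, in the spirit of Caron et al."""
    def __init__(self, out_dim, teacher_temp=0.04, student_temp=0.1, center_momentum=0.9):
        super().__init__()
        self.teacher_temp = teacher_temp
        self.student_temp = student_temp
        self.center_momentum = center_momentum
        self.register_buffer("center", torch.zeros(1, out_dim))

    def forward(self, student_out, teacher_out):
        # Sharpen the student output with its own temperature.
        student_logp = F.log_softmax(student_out / self.student_temp, dim=-1)
        # Center and sharpen the teacher output; no gradient flows through it.
        teacher_p = F.softmax((teacher_out - self.center) / self.teacher_temp, dim=-1).detach()
        loss = -(teacher_p * student_logp).sum(dim=-1).mean()
        self.update_center(teacher_out)
        return loss

    @torch.no_grad()
    def update_center(self, teacher_out):
        # Running estimate of the teacher's mean output (helps prevent collapse).
        batch_center = teacher_out.mean(dim=0, keepdim=True)
        self.center.mul_(self.center_momentum).add_(batch_center, alpha=1 - self.center_momentum)

@torch.no_grad()
def ema_update(student, teacher, momentum=0.996):
    # The teacher's weights are an exponential moving average of the student's.
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(momentum).add_(ps.detach(), alpha=1 - momentum)

# Toy usage: any ViT backbone plus projection head can play both roles.
out_dim = 4096
student = nn.Sequential(nn.Linear(384, 2048), nn.GELU(), nn.Linear(2048, out_dim))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad = False  # teacher is never trained by backprop

criterion = DINOLoss(out_dim)
view1, view2 = torch.randn(8, 384), torch.randn(8, 384)  # stand-ins for two augmented crops
loss = criterion(student(view1), teacher(view2))
loss.backward()
ema_update(student, teacher)
```

The key design choice is that the two networks share an architecture but only the student receives gradients; the teacher is updated purely by EMA, and the centering plus sharpening of its outputs keeps the self-distillation from collapsing to a trivial solution.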
