Towards Better Self-Supervised Representations: I-JEPA
We delve into the intuition and architectural details behind Meta’s I-JEPA model, a non-generative approach to self-supervised learning.
Introduction
Representation Learning as a field of research has gained a lot of popularity over the last couple of years.
Recent advances in unsupervised learning, especially in Natural Language Processing (BERT, ELMo, the notorious GPT models, and so on), have certainly played a part in drawing attention to the field.
Representation Learning is all about learning useful representations of data: a strong backbone, a set of features that downstream tasks can build on.
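To make the “backbone for downstream tasks” idea concrete, here is a minimal linear-probing sketch in PyTorch. It is an illustration of the general recipe, not anything from the I-JEPA paper: the encoder choice (ResNet-50), the 10-class head, and the input sizes are all placeholders. A frozen pretrained encoder supplies the features, and only a small task-specific head is trained.

```python
import torch
import torch.nn as nn
from torchvision import models

# A pretrained encoder plays the role of the "backbone": its weights are
# frozen, and only its output features are reused downstream.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()  # drop the original classification head
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

# A small task-specific head (here, a linear classifier for a
# hypothetical 10-class problem) is all that gets trained.
probe = nn.Linear(2048, 10)

images = torch.randn(4, 3, 224, 224)       # dummy batch of images
with torch.no_grad():
    features = backbone(images)             # (4, 2048) feature vectors
logits = probe(features)                    # only the probe gets gradients
```

The better the learned representations, the better a simple head like this performs, which is why linear probing is a common way to evaluate self-supervised backbones.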
A good way to think about this: imagine you are an artist learning how to paint. Your teacher, Pablo E̶s̶c̶o̶b̶a̶r Picasso, decides that holding your hand and directing you like a child is, perhaps, a useless approach, and that the best way for you to learn his style is to try to recreate it.
So he starts painting, and you watch. He suddenly stops and asks you to complete some parts of the painting. He sighs, paints another layer over your addition, and proceeds to paint, only to stop…