Souvik Mandal – Medium

Pinned

Souvik Mandal in ITNEXT

CLIP: Learning Transferable Visual Models From Natural Language Supervision

The CLIP (Contrastive Language-Image Pre-training) paper was published by OpenAI in 2021. This paper introduces the idea of using image caption…

Sep 25, 2023

Pinned

Souvik Mandal in ITNEXT

Attention is all you need!

Understanding the "Attention Is All You Need" paper: a detailed explanation of self-attention, cross-attention, and other design choices.

Jul 4, 2023

Pinned

Souvik Mandal

Demystifying the Attention Logic of Transformers: Unraveling the Intuition and Implementation

Jun 17, 2023

Souvik Mandal in ITNEXT

LLM Compression Techniques

Efficient deployment of large language models through compression techniques: quantization, pruning, and distillation.

Jun 2

Souvik Mandal in ITNEXT

Deep Learning Model Optimization: Why and How?

Why do we need model optimization, and what are the different methods for doing it?

Dec 9, 2023

Souvik Mandal in ITNEXT

IMAGEBIND: One Embedding Space To Bind Them All

IMAGEBIND is an approach to learning joint embeddings across six different modalities: image, text, audio, depth, thermal, and IMU data.

Dec 4, 2023

Souvik Mandal

Vision Transformers need registers

Improving vision transformer attention by adding extra learnable tokens (registers) alongside the class token.

Nov 26, 2023

Souvik Mandal in ITNEXT

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?

An efficient way to reduce the number of tokens and the computation required, with little to no decrease in performance.

Nov 19, 2023

Souvik Mandal in ITNEXT

Masked Autoencoders As Spatiotemporal Learners

This paper presents the results and findings of applying MAE to a video dataset. The same concepts should be applicable to other 3-dimensional…

Aug 23, 2023

Souvik Mandal in ITNEXT

I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

This paper from Meta proposes a self-supervised technique which tries to learn highly semantic image features without relying on…

Aug 11, 2023

Souvik Mandal

Senior AI Scientist @ Qure.ai, Ex Fractal, CSE IIT Indore, 20. LinkedIn: https://www.linkedin.com/in/mandalsouvik/
