Pinned | Souvik Mandal in ITNEXT | CLIP: Learning Transferable Visual Models From Natural Language Supervision | The CLIP (Contrastive Language-Image Pre-training) paper was published in 2021 by OpenAI. This paper introduces the idea of using image caption… | Sep 25, 2023
Pinned | Souvik Mandal in ITNEXT | Attention is all you need! | Understanding the "Attention Is All You Need" paper. A detailed explanation of self-attention, cross-attention, and other design choices. | Jul 4, 2023
Pinned | Souvik Mandal | Demystifying the Attention Logic of Transformers: Unraveling the Intuition and Implementation | Jun 17, 2023
Souvik Mandal in ITNEXT | LLM Compression Techniques | Efficient deployment of large language models through quantization, pruning, and distillation compression techniques. | Jun 2
Souvik Mandal in ITNEXT | Deep Learning Model Optimization: Why and How? | Why do we need model optimization, and what are the different methods of model optimization? | Dec 9, 2023
Souvik Mandal in ITNEXT | IMAGEBIND: One Embedding Space To Bind Them All | IMAGEBIND is an approach to learning joint embeddings across six different modalities: image, text, audio, depth, thermal, and IMU data. | Dec 4, 2023
Souvik Mandal | Vision Transformers need registers | Improving transformer attention by increasing the number of class tokens (registers). | Nov 26, 2023
Souvik Mandal in ITNEXT | TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? | An efficient way to reduce the number of tokens and the computation requirements with little to no decrease in performance. | Nov 19, 2023
Souvik Mandal in ITNEXT | Masked Autoencoders As Spatiotemporal Learners | This paper shows the results and findings of applying MAE on a video dataset. The same concepts should be applicable to other 3-dimensional… | Aug 23, 2023
Souvik Mandal in ITNEXT | I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture | This paper from Meta proposes a self-supervised technique that tries to learn highly semantic image features without relying on… | Aug 11, 2023