Ganesh Bajaj, in Artificial Intelligence in Plain English: "The Expanding Llama Family: A Detailed Overview of Meta’s Open-Source LLMs." Meta’s Llama family of large language models (LLMs) has rapidly evolved since its initial release, offering a powerful and accessible suite… (1d ago)
Elahe Aghapour & Salar Rahili, in Towards Data Science: "From Unimodals to Multimodality: DIY Techniques for Building Foundational Models." Using advanced techniques like prompt adaptation and adapters to transform open-source unimodal models into multimodal ones. (Jun 25)
Shubham Karwa: "Exploring Multimodal Large Language Models: A Step Forward in AI." In the dynamic realm of artificial intelligence, the advent of Multimodal Large Language Models (MLLMs) is revolutionizing how we interact… (Nov 16, 2023)
Kirushikesh DB: "Building a Grammar Assistant with ASR, NLP, and TTS: A Deep Dive." In today’s digital age, effective communication is more important than ever. Whether you’re a non-native English speaker looking to improve… (Oct 6)
TK Yeow: "Why LlamaParse could be better than GPT on multimodal parsing?" What if there’s a multimodal document parser which works decently on extracting modality such as images/tables/formula at almost 0 cost… (Sep 4)
Moein Shariatnia, in Towards Data Science: "Simple Implementation of OpenAI CLIP model: A Tutorial." A tutorial on simple implementation of CLIP model in PyTorch. (Apr 7, 2021)
Simeon Emanuilov: "LLaVA-OneVision: Pushing boundaries in multimodal AI with visual task transfer." In the rapidly evolving field of artificial intelligence, multimodal models that can process and understand both visual and textual… (Aug 12)