François Phe in data from the trenches · Demystifying Multimodal LLM: Unlocking the Power of Fusion in Language and Vision · Mar 21
François Phe in data from the trenches · Paying Attention to Text and Images for Visual Question Answering: Attention first originated in translation systems as a way to focus on parts of the input sentence, when generating words of the translated… · Dec 15, 2022