Michael X in Artificial Intelligence in Plain English | Overview of Multimodal LLMs — Algorithm, Dataset And Evaluation: Dataset Construction | Collecting multimodal instruction-following data is key to training multimodal language models. | Jun 12
Michael X in Artificial Intelligence in Plain English | How To Train Multimodal LLMs To Understand And Interact With Text, Image, Video And Audio: Model… | Multimodal instruction models can be evaluated using closed-set and open-set questions, as well as qualitative assessments. | Apr 26
Michael X in Artificial Intelligence in Plain English | How To Train Multimodal LLMs To Understand And Interact With Text, Image, Video And Audio: Model… | Apr 2
Michael X in Artificial Intelligence in Plain English | How To Train Multimodal LLMs To Understand And Interact With Text, Image, Video And Audio: Model… | Current MLLMs insert visual embeddings from vision experts into the pre-trained language embedding space. This article introduces the key works. | Mar 21
Michael X in Artificial Intelligence in Plain English | How To Train Multimodal LLMs To Understand And Interact With Text, Image, Video And Audio | A concise introduction to the world of multimodal Large Language Models (LLMs), an overview of their background and how to train them. | Mar 8
Michael X | Generative AI for Innovative Photo & Video Editing & Creation — APPs and Enabling Datasets | In the ever-evolving realm of digital artistry, the emergence of Generative Artificial Intelligence (AI) has unlocked a vast array of… | Dec 19, 2023
Michael X in Artificial Intelligence in Plain English | Enhancing LLMs With Vision Experts (Part 3) | If you missed out on the previous articles of this series, please read: | Nov 22, 2023
Michael X in Artificial Intelligence in Plain English | Enhancing LLMs With Vision Experts (Part 2) | If you missed out on the first article of this series, please read: | Nov 15, 2023
Michael X in Artificial Intelligence in Plain English | Enhancing LLMs With Vision Experts (Part 1) | Abstract | Nov 9, 2023
Michael X in Artificial Intelligence in Plain English | Autonomous Driving Technology Revolution: From SLAM+DL to BEV+Transformer (Part 3) | If you missed out on the first article of this series, please click: | Oct 25, 2023