Tatsuyuki Sekine, in Remote Sensing Tech by ELSPINA VEINZ Inc. "Paper reading: GeoChat". Large Vision-language Models (VLMs) have shown great success in natural image domains. However, there is a performance drop in remote… Aug 19
Subarna Tripathi, in Towards Data Science. "Long-form video representation learning (Part 3: Long-form egocentric video representation…". We explore novel video representation learning methods that are equipped with long-form reasoning capability. This is Part III providing a… May 14
Djohra IBERRAKEN. "Top papers from CVPR 2024 : Comprehensive Overview". One of the most prestigious conferences in the field of AI, CVPR for Computer Vision and Pattern Recognition, is currently taking place… Jun 21
Taks@skyfoliage.com. "Read the MaskGIT (CVPR2022) paper and share I thoughts". I was reading the paper Genie by J. Bruce from Google DeepMind, which was published at ICML 2024. Since it mentions that many of its… Aug 5
Subarna Tripathi, in Towards Data Science. "Long-form video representation learning (Part 2: Video as sparse transformers)". We explore novel video representations methods that are equipped with long-form reasoning capability. This is part II focusing on sparse… May 14
The Tenyks Blogger. "CVPR 2024: Image and Video Search & Understanding (RAG, Multimodal, Embeddings, and more)". We breakdown the top papers in Image and Video Search & Understanding from CVPR 2024! Jun 14
Nandini Lokesh Reddy. "Revolutionizing Object Recognition: From Predefined Labels to Dynamic Predictions". In the computer vision domain, a fundamental problem is object recognition and classification. To recognize a single object, we need to… Aug 1
Subarna Tripathi, in Towards Data Science. "Long-form video representation learning (Part 1: Video as graphs)". We explore novel video representations methods that are equipped with long-form reasoning capability. This is part 1 focusing on video… May 14