arXiv Weekly Roundup #14

Explanatory AI
5 min read · Jun 30, 2023

Greetings, Medium community,

This edition covers papers posted on arXiv in late June 2023 and shares insights and analysis on the most significant research and trends.

Let’s dive in!

3DSAM-adapter: Holistic Adaptation of SAM from 2D to 3D for Promptable Medical Image Segmentation

Extension of SAM to 3D tasks.

Although the Segment Anything Model (SAM) has achieved impressive results on general-purpose semantic segmentation, with strong generalization ability on everyday images, its performance on medical image segmentation is less precise and less stable, especially on tumor segmentation tasks that involve objects of small sizes, irregular shapes, and low contrast. Notably, the original SAM architecture is designed for 2D natural images and therefore cannot effectively extract the 3D spatial information from volumetric medical data. In this paper, we propose a novel adaptation method for transferring SAM from 2D to 3D for promptable medical image segmentation. Through a holistically designed scheme for architecture modification, we transfer SAM to support volumetric inputs while retaining the majority of its pre-trained parameters for reuse. The fine-tuning process is conducted in a parameter-efficient manner, wherein most of the pre-trained parameters remain frozen, and only a few lightweight spatial adapters are introduced and tuned.

Overview of our proposed method for 3DSAM-adapter. By arxiv.org/abs/2306.13465
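To make the "frozen backbone plus lightweight adapters" idea concrete, here is a minimal pure-Python sketch. It is not the 3DSAM-adapter architecture: the `FrozenLinear` and `BottleneckAdapter` classes, the residual bottleneck design, the zero-initialised up-projection, and the toy sizes are all assumptions chosen for illustration. It only shows why such a setup trains a small share of the parameters and leaves the pre-trained behaviour untouched at the start of fine-tuning.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FrozenLinear:
    """Hypothetical stand-in for a pre-trained layer that is never updated."""

    def __init__(self, weights):
        self.weights = weights
        self.trainable = False  # frozen during fine-tuning

    def __call__(self, x):
        return matvec(self.weights, x)

    def num_params(self):
        return sum(len(row) for row in self.weights)


class BottleneckAdapter:
    """Hypothetical lightweight adapter: x + up(relu(down(x)))."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        self.down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                     for _ in range(bottleneck)]
        # Zero-initialised up-projection (an assumption): the adapter
        # starts out as an identity map, so the frozen output is unchanged.
        self.up = [[0.0] * bottleneck for _ in range(dim)]
        self.trainable = True  # only these weights would be tuned

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.down, x)]
        return [xi + ui for xi, ui in zip(x, matvec(self.up, hidden))]

    def num_params(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def trainable_fraction(modules):
    """Share of all parameters that belong to trainable modules."""
    total = sum(m.num_params() for m in modules)
    trainable = sum(m.num_params() for m in modules if m.trainable)
    return trainable / total


dim, bottleneck = 16, 2
identity = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
frozen = FrozenLinear(identity)
adapter = BottleneckAdapter(dim, bottleneck)

x = [float(i) for i in range(dim)]
y = adapter(frozen(x))

print(y == x)                                   # True: adapter starts as identity
print(trainable_fraction([frozen, adapter]))    # 0.2 in this toy setup
```

In this toy setup the adapter holds 64 of 320 parameters. In a real foundation model the frozen backbone is far larger, so the trainable share would be much smaller.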

Learn over Past, Evolve for Future: Forecasting Temporal Trends for Fake News Detection

Improving fake news detection by accounting for historical trends in news topics.

Fake news detection has been a critical task for maintaining the health of the online news ecosystem. However, very few existing works consider the temporal shift issue caused by the rapidly-evolving nature of news data in practice, resulting in significant performance degradation when training on past data and testing on future data. In this paper, we observe that the appearances of news events on the same topic may display discernible patterns over time, and posit that such patterns can assist in selecting training instances that could make the model adapt better to future data.

The architecture of the proposed FTT (Forecasting Temporal Trends) framework. By arxiv.org/abs/2306.14728
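The core intuition, that topic frequencies follow patterns which can guide the choice of training instances, can be sketched in a few lines. This is a toy illustration and not the FTT framework: the abstract does not specify the forecasting model or the selection rule, so the naive linear extrapolation in `forecast_next`, the share-based weighting in `instance_weights`, and the topic names are all assumptions.

```python
from collections import Counter


def topic_counts_by_window(items, num_windows):
    """Count items per topic in each time window.

    `items` is a list of (window_index, topic) pairs.
    """
    counts = [Counter() for _ in range(num_windows)]
    for window, topic in items:
        counts[window][topic] += 1
    return counts


def forecast_next(counts):
    """Naive linear extrapolation from the last two windows (an assumption)."""
    last, prev = counts[-1], counts[-2]
    topics = set(last) | set(prev)
    return {t: max(0, 2 * last[t] - prev[t]) for t in topics}


def instance_weights(items, forecast):
    """Weight each training item by its topic's forecast share of future news."""
    total = sum(forecast.values())
    return [forecast.get(topic, 0) / total for _, topic in items]


# Hypothetical history: "election" is fading, "sports" is rising.
history = (
    [(0, "election")] * 3 + [(0, "sports")] * 1
    + [(1, "election")] * 2 + [(1, "sports")] * 2
)

counts = topic_counts_by_window(history, num_windows=2)
forecast = forecast_next(counts)
weights = instance_weights(history, forecast)

print(forecast["election"], forecast["sports"])  # 1 3
print(weights[0], weights[-1])                   # 0.25 0.75
```

Items from the topic expected to grow receive more weight, which is one simple way to bias training toward what future data is likely to look like.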

Low-Confidence Samples Mining for Semi-supervised Object Detection

Reusing low-confidence objects improves recall of semi-supervised approaches for object detection.

Reliable pseudo-labels from unlabeled data play a key role in semi-supervised object detection (SSOD). However, the state-of-the-art SSOD methods all rely on pseudo-labels with high confidence, which ignore valuable pseudo-labels with lower confidence. Additionally, the insufficient excavation for unlabeled data results in an excessively low recall rate thus hurting the network training. In this paper, we propose a novel Low-confidence Samples Mining (LSM) method to utilize low-confidence pseudo-labels efficiently. Specifically, we develop an additional pseudo information mining (PIM) branch on account of low-resolution feature maps to extract reliable large-area instances, the IoUs of which are higher than small-area ones. Owing to the complementary predictions between PIM and the main branch, we further design self-distillation (SD) to compensate for both in a mutually-learning manner.

LSM pipeline. By arxiv.org/abs/2306.16201
Comparison with the state-of-the-arts from different percentages of labeled MS-COCO. By arxiv.org/abs/2306.16201
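The recall problem described in the abstract is easy to reproduce on toy data. The sketch below does not implement LSM, PIM, or self-distillation. It only measures how many ground-truth boxes are covered by pseudo-labels at different confidence thresholds. The boxes, scores, thresholds, and the `pseudo_label_recall` helper are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def pseudo_label_recall(predictions, ground_truth, score_threshold,
                        iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a kept pseudo-label."""
    kept = [box for box, score in predictions if score >= score_threshold]
    matched = sum(
        1 for gt in ground_truth
        if any(iou(gt, box) >= iou_threshold for box in kept)
    )
    return matched / len(ground_truth)


# Hypothetical ground truth and detector outputs as (box, confidence) pairs.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 44, 44)]
predictions = [
    ((0, 0, 10, 10), 0.95),    # confident and correct
    ((21, 21, 30, 30), 0.60),  # correct, but only moderately confident
    ((40, 40, 44, 45), 0.35),  # correct, but low confidence
]

for threshold in (0.9, 0.5, 0.3):
    recall = pseudo_label_recall(predictions, ground_truth, threshold)
    print(threshold, round(recall, 2))
# 0.9 0.33
# 0.5 0.67
# 0.3 1.0
```

With a strict threshold only one of the three objects survives as a pseudo-label, even though all three detections are correct. The hard part, which the paper addresses, is recovering the useful low-confidence boxes without also admitting the noisy ones.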

Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and Localization

A framework that integrates end-to-end forgery detection and localization into SAM.

The rapid advancements in computer vision have stimulated remarkable progress in face forgery techniques, capturing the dedicated attention of researchers committed to detecting forgeries and precisely localizing manipulated areas. Nonetheless, with limited fine-grained pixelwise supervision labels, deepfake detection models perform unsatisfactorily on precise forgery detection and localization. To address this challenge, we introduce the well-trained vision segmentation foundation model, i.e., Segment Anything Model (SAM) in face forgery detection and localization. Based on SAM, we propose the Detect Any Deepfakes (DADF) framework with the Multiscale Adapter, which can capture short- and long-range forgery contexts for efficient fine-tuning.

Framework of the proposed Detect Any Deepfakes (DADF). By arxiv.org/abs/2306.17075

More to read

Thank you for joining us for this week’s arXiv Computer Science Digest. We hope you found the insights and trends presented here helpful in understanding the latest developments in the field of AI.

If you have any feedback or suggestions, please contact us.

Have a great weekend and see you next Friday. Bye.
