SAM2-UNeXT: An Improved High-Resolution Baseline for Adapting Foundation Models to Downstream Segmentation Tasks

📅 2025-08-05
🤖 AI Summary
Foundation models such as SAM often need a stronger, more generalizable encoder to perform well on downstream segmentation tasks. This paper proposes SAM2-UNeXT, a high-resolution framework that builds on SAM2-UNet and extends SAM2 in three ways: (1) a dual-resolution input strategy combined with a dense glue layer for fusing features from the two encoders; (2) an auxiliary DINOv2 encoder that enriches the semantic representation; and (3) a simple decoder, which removes the need for complex decoder designs. The authors report superior performance on four benchmark tasks: dichotomous image segmentation, camouflaged object detection, marine animal segmentation, and remote sensing saliency detection.

📝 Abstract
Recent studies have highlighted the potential of adapting the Segment Anything Model (SAM) for various downstream tasks. However, constructing a more powerful and generalizable encoder to further enhance performance remains an open challenge. In this work, we propose SAM2-UNeXT, an advanced framework that builds upon the core principles of SAM2-UNet while extending the representational capacity of SAM2 through the integration of an auxiliary DINOv2 encoder. By incorporating a dual-resolution strategy and a dense glue layer, our approach enables more accurate segmentation with a simple architecture, relaxing the need for complex decoder designs. Extensive experiments conducted on four benchmarks, including dichotomous image segmentation, camouflaged object detection, marine animal segmentation, and remote sensing saliency detection, demonstrate the superior performance of our proposed method. The code is available at https://github.com/WZH0120/SAM2-UNeXT.
Problem

Research questions and friction points this paper is trying to address.

Adapting SAM2 more effectively to downstream segmentation tasks
Building a more powerful and generalizable encoder without relying on complex decoders
Achieving accurate segmentation through a dual-resolution strategy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates an auxiliary DINOv2 encoder with SAM2
Uses a dual-resolution strategy to improve segmentation accuracy
Combines a dense glue layer with a simple architecture, avoiding complex decoder designs
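
The dual-resolution idea above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`upsample_nearest`, `project_1x1`, `fuse_dual_resolution`), the nearest-neighbour upsampling, and the 1x1 channel projection are all assumptions chosen for illustration, and the projection is only a stand-in for the paper's dense glue layer, whose actual design is not described in this summary. Feature maps are plain nested lists in `[channel][row][column]` order.

```python
def upsample_nearest(channels, factor):
    """Nearest-neighbour upsampling of a [C][H][W] nested-list feature map."""
    out = []
    for channel in channels:
        rows = []
        for row in channel:
            # Repeat every value `factor` times along the width...
            wide = [value for value in row for _ in range(factor)]
            # ...and repeat every widened row `factor` times along the height.
            rows.extend(list(wide) for _ in range(factor))
        out.append(rows)
    return out


def project_1x1(channels, weights):
    """Mix channels at each pixel; `weights` has shape [C_out][C_in]."""
    height, width = len(channels[0]), len(channels[0][0])
    return [
        [
            [sum(w * channels[i][y][x] for i, w in enumerate(w_row))
             for x in range(width)]
            for y in range(height)
        ]
        for w_row in weights
    ]


def fuse_dual_resolution(high_feats, low_feats, weights):
    """Align the low-res branch to the high-res grid, concatenate, project.

    Hypothetical stand-in for "dual-resolution encoders + glue layer":
    high_feats would come from the high-resolution branch (e.g. SAM2),
    low_feats from the auxiliary low-resolution branch (e.g. DINOv2).
    """
    factor = len(high_feats[0]) // len(low_feats[0])
    aligned = upsample_nearest(low_feats, factor)
    # List concatenation here is concatenation along the channel axis.
    return project_1x1(high_feats + aligned, weights)


# Toy example: one 4x4 high-res channel and one 2x2 low-res channel.
high = [[[1.0] * 4 for _ in range(4)]]
low = [[[1.0, 2.0], [3.0, 4.0]]]
fused = fuse_dual_resolution(high, low, weights=[[1.0, 1.0]])
```

In this toy setup the low-resolution map is upsampled by a factor of 2 so both branches share a 4x4 grid before their channels are mixed. In a real model the two branches would be learned encoders and the projection weights would be trained rather than fixed.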
Authors

Xinyu Xiong
Sun Yat-sen University; HIKVISION
Zihuang Wu
Jiangxi Normal University
Lei Zhang
Sun Yat-sen University
Lei Lu
Hainan University
Ming Li
Shandong Inspur Database Technology Co., Ltd.
Guanbin Li
Sun Yat-sen University