BED-SAM2: Boundary-Enhanced-Depth SAM2 via Monocular Geometric Priors

📅 2026-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited boundary segmentation accuracy of vision foundation models in salient and camouflaged object detection with a method built on the SAM2 architecture. It modifies the Hiera encoder to encode monocular depth priors directly from RGB images and integrates these geometric cues into the segmentation pipeline to improve boundary awareness. According to the authors, this is the first study to incorporate monocular depth priors into the SAM2 framework. The model is reported to converge in as few as five training epochs and to reach performance competitive with the state of the art on multiple benchmarks for both salient and camouflaged object detection, improving boundary segmentation accuracy, particularly for objects in complex scenes and those with strong camouflage.

📝 Abstract

Building upon the SAM2 vision foundation model for downstream segmentation, this study introduces Boundary Enhanced Depth (BED)-SAM2. The SAM2 Hiera encoder architecture is modified to directly encode monocular depth information from RGB images, thereby providing geometric cues that enhance object boundary delineation and facilitate the extraction of camouflaged object shapes. BED-SAM2 demonstrates competitive state-of-the-art performance across multiple salient and camouflaged object detection tasks with as few as five training epochs.
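To illustrate why depth can act as a boundary cue, the sketch below marks pixels where a depth map changes abruptly. It is not the BED-SAM2 method: the abstract describes depth being encoded inside the modified Hiera encoder, whereas this toy example assumes an explicit depth map is already available. The function name `depth_edge_map`, the `threshold` parameter, and the toy depth values are all hypothetical and chosen only for illustration.

```python
def depth_edge_map(depth, threshold=0.5):
    """Mark pixels whose depth differs from the right or lower neighbour
    by more than `threshold` (hypothetical helper, illustration only)."""
    rows, cols = len(depth), len(depth[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Forward differences; pixels on the last column/row have no neighbour.
            dx = abs(depth[r][c + 1] - depth[r][c]) if c + 1 < cols else 0.0
            dy = abs(depth[r + 1][c] - depth[r][c]) if r + 1 < rows else 0.0
            if max(dx, dy) > threshold:
                edges[r][c] = 1
    return edges


# Toy scene (assumed values): a near object at depth 1.0
# in front of a far background at depth 5.0.
depth = [
    [5.0, 5.0, 5.0, 5.0],
    [5.0, 1.0, 1.0, 5.0],
    [5.0, 1.0, 1.0, 5.0],
    [5.0, 5.0, 5.0, 5.0],
]

edges = depth_edge_map(depth)
for row in edges:
    print(row)
```

The marked pixels trace the depth discontinuity around the object, even if the object's color matched the background exactly. That is the intuition behind using geometric cues for camouflaged objects, where appearance alone gives a weak boundary signal.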

Problem

Research questions and friction points this paper is trying to address.

object boundary delineation

camouflaged object detection

monocular depth

segmentation

geometric cues

Innovation

Methods, ideas, or system contributions that make the work stand out.

Boundary-Enhanced Depth

SAM2

Monocular Depth Estimation