Patch-Depth Fusion: Dichotomous Image Segmentation via Fine-Grained Patch Strategy and Depth Integrity-Prior

📅 2025-03-08
🤖 AI Summary
In dichotomous image segmentation (DIS), the high-precision binary segmentation of high-resolution natural images, local detail optimization is well explored but global object integrity modeling remains insufficient. To address this, the paper proposes the Patch-Depth Fusion Network (PDFNet). The method uses pseudo-depth maps generated by Depth Anything V2 as a structural integrity prior for objects; combines a shared encoder, a depth refinement decoder, and a patch-level feature selection and enhancement module; and adds a depth integrity-prior loss so that local detail sensitivity and global structural consistency are optimized jointly. On the DIS-5K benchmark, PDFNet significantly outperforms state-of-the-art non-diffusion methods and matches or surpasses the latest diffusion-based models while using less than 11% of their parameters. Key contributions include: (i) the formulation of a depth-based object integrity prior, (ii) a novel multimodal patch-depth fusion architecture, and (iii) a lightweight, computationally efficient design.

๐Ÿ“ Abstract
Dichotomous Image Segmentation (DIS) is a high-precision object segmentation task for high-resolution natural images. Current mainstream methods focus on optimizing local details but overlook the fundamental challenge of modeling object integrity. We find that the depth integrity-prior implicit in the pseudo-depth maps generated by Depth Anything Model v2, together with the local detail features of image patches, can jointly address this problem. Based on this finding, we design a novel Patch-Depth Fusion Network (PDFNet) for high-precision dichotomous image segmentation. The core of PDFNet consists of three aspects. Firstly, object perception is enhanced through multi-modal input fusion, and sensitivity to details is improved by a fine-grained patch strategy coupled with patch selection and enhancement. Secondly, by leveraging the depth integrity-prior distributed in the depth maps, we propose an integrity-prior loss to enhance the uniformity of the segmentation results in the depth maps. Finally, we feed the features of the shared encoder to a simple depth refinement decoder, which improves the ability of the shared encoder to capture subtle depth-related information in the images. Experiments on the DIS-5K dataset show that PDFNet significantly outperforms state-of-the-art non-diffusion methods. Owing to the depth integrity-prior, PDFNet matches or even surpasses the performance of the latest diffusion-based methods while using less than 11% of their parameters. The source code is available at https://github.com/Tennine2077/PDFNet.
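The patch selection idea mentioned in the abstract can be illustrated with a toy sketch. The abstract does not describe how PDFNet selects patches, so everything here is an assumption made for illustration: the function names (`split_into_patches`, `detail_score`, `select_detail_patches`) are hypothetical, and the hand-written neighbour-difference score merely stands in for whatever criterion the network actually uses. The sketch splits an image into non-overlapping patches and keeps the ones with the most local variation.

```python
def split_into_patches(image, patch):
    """Split a 2-D list into non-overlapping patch x patch tiles.

    Returns a dict mapping (patch_row, patch_col) to the tile.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tiles = {}
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tile = [row[left:left + patch] for row in image[top:top + patch]]
            tiles[(top // patch, left // patch)] = tile
    return tiles


def detail_score(tile):
    """Sum of absolute differences between horizontal and vertical neighbours.

    Assumption: a hand-crafted proxy for "fine detail", not the paper's criterion.
    """
    score = 0.0
    for r, row in enumerate(tile):
        for c, value in enumerate(row):
            if c + 1 < len(row):
                score += abs(row[c + 1] - value)
            if r + 1 < len(tile):
                score += abs(tile[r + 1][c] - value)
    return score


def select_detail_patches(image, patch, k):
    """Return the grid positions of the k tiles with the highest detail score."""
    tiles = split_into_patches(image, patch)
    ranked = sorted(tiles, key=lambda pos: detail_score(tiles[pos]), reverse=True)
    return ranked[:k]


# Toy 4x4 image: only the top-right 2x2 tile contains any variation.
image = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
top_patches = select_detail_patches(image, patch=2, k=1)
```

In this toy input the only tile with non-zero score is the top-right one, so it is the single patch selected. A learned selection module would replace `detail_score` with features computed by the network.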
Problem

Research questions and friction points this paper is trying to address.

High-precision object segmentation in high-resolution natural images.
Modeling object integrity, which mainstream methods overlook in favor of local detail optimization.
Leveraging depth integrity-prior and patch fine-grained strategy for improved segmentation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Patch-Depth Fusion Network for precise segmentation
Depth integrity-prior loss enhances segmentation uniformity
Shared encoder with depth refinement decoder
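
One way to picture what a depth-based uniformity penalty could look like is the sketch below. It is not the paper's integrity-prior loss, whose formula is not given on this page; it is an assumed stand-in that computes the prediction-weighted variance of pseudo-depth over the predicted foreground. The name `depth_uniformity_penalty` and the `eps` parameter are hypothetical. The penalty is near zero when the predicted mask covers a region of uniform depth and grows when the mask leaks onto pixels at a different depth. Note that it does not penalize missing object parts, so it would only make sense alongside a standard segmentation loss.

```python
def depth_uniformity_penalty(pred, depth, eps=1e-8):
    """Prediction-weighted variance of depth over the predicted foreground.

    pred  : 2-D list of foreground probabilities in [0, 1]
    depth : 2-D list of pseudo-depth values with the same shape as pred
    eps   : small constant that avoids division by zero for empty masks
    Assumption: an illustrative penalty, not the loss defined in the paper.
    """
    p = [value for row in pred for value in row]
    d = [value for row in depth for value in row]
    if len(p) != len(d):
        raise ValueError("pred and depth must have the same shape")
    mass = sum(p) + eps
    mean_depth = sum(pi * di for pi, di in zip(p, d)) / mass
    return sum(pi * (di - mean_depth) ** 2 for pi, di in zip(p, d)) / mass


# Toy scene: an object at depth 0.9 (left two columns), background at 0.1.
depth = [
    [0.9, 0.9, 0.1],
    [0.9, 0.9, 0.1],
]
complete_mask = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
leaky_mask = [
    [1.0, 1.0, 1.0],  # one background pixel wrongly marked as foreground
    [1.0, 1.0, 0.0],
]

clean_penalty = depth_uniformity_penalty(complete_mask, depth)
leaky_penalty = depth_uniformity_penalty(leaky_mask, depth)
```

Here `clean_penalty` is approximately zero, while `leaky_penalty` is about 0.1024, because the leaked background pixel sits far from the mean depth of the object.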