Depth Edge Alignment Loss: DEALing with Depth in Weakly Supervised Semantic Segmentation

📅 2025-09-22
🤖 AI Summary
To reduce the cost of the pixel-level annotations that fully supervised semantic segmentation requires in autonomous robotics, this paper proposes a model-agnostic Depth Edge Alignment Loss (DEAL) for weakly supervised semantic segmentation. The method generates pixel-level semantic labels from image-level supervision and adds supervision from pixel-level depth maps, a modality commonly available on robotic systems, so no additional manual annotation is required. DEAL uses depth edges as a geometric prior to improve the spatial consistency of the weak supervision signal, and it can be applied across different segmentation models and combined with other losses. Experiments report improvements in mean Intersection over Union (mIoU) of up to 5.44, 1.27, and 16.42 points on the PASCAL VOC validation set, the MS COCO validation set, and the HOPE static onboarding split, respectively.
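To make the idea of a depth-edge prior concrete, here is a minimal NumPy sketch of one possible edge-aware penalty. It is not the paper's formulation, which this page does not give. The function names (`edge_magnitude`, `depth_edge_alignment_penalty`), the forward-difference edge detector, and the `1 - depth_edges` weighting are all assumptions chosen for illustration. The penalty is small when predicted class boundaries coincide with depth discontinuities and larger when a boundary crosses a region where the depth map is smooth.

```python
import numpy as np

def edge_magnitude(x):
    """Forward-difference gradient magnitude of a 2D array (same shape as x)."""
    x = np.asarray(x, dtype=float)
    dy = np.zeros_like(x)
    dx = np.zeros_like(x)
    dy[:-1, :] = np.abs(x[1:, :] - x[:-1, :])   # vertical differences
    dx[:, :-1] = np.abs(x[:, 1:] - x[:, :-1])   # horizontal differences
    return dx + dy

def depth_edge_alignment_penalty(probs, depth, eps=1e-8):
    """Hypothetical penalty: class boundaries are cheap on depth edges, costly elsewhere.

    probs: (C, H, W) per-class probabilities; depth: (H, W) depth map.
    """
    depth_edges = edge_magnitude(depth)
    depth_edges = depth_edges / (depth_edges.max() + eps)   # scale to [0, 1]
    seg_edges = sum(edge_magnitude(p) for p in probs)        # boundary strength
    return float(np.mean(seg_edges * (1.0 - depth_edges)))

# Toy 4x4 scene: the depth map steps from 1.0 to 3.0 between columns 1 and 2.
depth = np.array([[1.0, 1.0, 3.0, 3.0]] * 4)

# Prediction whose class boundary sits on the depth step.
left = np.array([[1.0, 1.0, 0.0, 0.0]] * 4)
aligned = np.stack([left, 1.0 - left])

# Prediction whose class boundary sits one column to the left of the step.
shifted = np.array([[1.0, 0.0, 0.0, 0.0]] * 4)
misaligned = np.stack([shifted, 1.0 - shifted])

aligned_penalty = depth_edge_alignment_penalty(aligned, depth)
misaligned_penalty = depth_edge_alignment_penalty(misaligned, depth)
```

In a training loop, a term of this kind would typically be added to a model's existing loss with a weighting factor, which is one way a loss can stay independent of the segmentation architecture. A differentiable framework would be needed in place of NumPy for actual training.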

📝 Abstract
Autonomous robotic systems applied to new domains require an abundance of expensive, pixel-level dense labels to train robust semantic segmentation models under full supervision. This study proposes a model-agnostic Depth Edge Alignment Loss to improve Weakly Supervised Semantic Segmentation models across different datasets. The methodology generates pixel-level semantic labels from image-level supervision, avoiding expensive annotation processes. While weak supervision is widely explored in traditional computer vision, our approach adds supervision with pixel-level depth information, a modality commonly available in robotic systems. We demonstrate that our approach improves segmentation performance across datasets and models and can also be combined with other losses for even better performance, with improvements of up to +5.439, +1.274, and +16.416 points in mean Intersection over Union on the PASCAL VOC and MS COCO validation sets and the HOPE static onboarding split, respectively. Our code will be made publicly available.
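The improvements above are stated in points of mean Intersection over Union, that is, absolute differences on a 0 to 100 scale. As a reminder of how the metric is computed, the following sketch evaluates mIoU for a single pair of label maps. The function name `mean_iou` and the toy arrays are illustrative assumptions. Benchmark evaluations usually accumulate intersections and unions over the whole dataset rather than per image, and the paper's exact protocol is not described on this page.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the target."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps; leave it out of the mean
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy 2x2 label maps with two classes.
target = np.array([[0, 0],
                   [1, 1]])
pred = np.array([[0, 1],
                 [1, 1]])

# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
miou = mean_iou(pred, target, num_classes=2)
miou_points = 100.0 * miou  # the same value on the 0 to 100 scale
```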
Problem

Research questions and friction points this paper is trying to address.

Reducing expensive pixel-level annotation needs for semantic segmentation
Improving weakly supervised segmentation with depth information, which is commonly available on robotic systems
Generating pixel-level semantic labels from image-level supervision only
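This page does not say how the pixel-level labels are derived from image-level supervision. One widely used approach in weakly supervised segmentation is to threshold class activation maps (CAMs) produced by an image classifier, and the sketch below illustrates that generic idea only. The function name `cams_to_pseudo_labels`, the `bg_threshold` value, and the convention that label 0 means background are assumptions, not details from the paper.

```python
import numpy as np

def cams_to_pseudo_labels(cams, image_labels, bg_threshold=0.3):
    """Turn per-class activation maps into a pixel-level pseudo-label map.

    cams: (C, H, W) activation maps for the C foreground classes.
    image_labels: (C,) binary vector of classes present in the image.
    Returns an (H, W) integer map: 0 is background, c + 1 is foreground class c.
    """
    cams = np.maximum(cams, 0.0) * image_labels[:, None, None]  # drop absent classes
    peak = cams.max(axis=(1, 2), keepdims=True)
    cams = cams / np.maximum(peak, 1e-8)                        # per-class scale to [0, 1]
    fg_class = cams.argmax(axis=0) + 1
    fg_score = cams.max(axis=0)
    return np.where(fg_score >= bg_threshold, fg_class, 0)

# Toy example: two foreground classes, but only class 0 is tagged as present.
cams = np.array([[[4.0, 1.0],
                  [0.0, 0.0]],
                 [[9.0, 9.0],
                  [9.0, 9.0]]])
image_labels = np.array([1, 0])

pseudo = cams_to_pseudo_labels(cams, image_labels)
```

The activations of the untagged class are ignored entirely, which is how the image-level label constrains the pixel-level output in this scheme.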
Innovation

Methods, ideas, or system contributions that make the work stand out.

Depth Edge Alignment Loss for weakly supervised segmentation
Uses depth information to improve semantic label generation
Model-agnostic approach compatible with various segmentation models