🤖 AI Summary
Annotations for instance segmentation are costly and prone to noise, which can severely compromise model robustness and generalization. To address this, we introduce the first systematic noise benchmark suite for instance segmentation (COCO-N, Cityscapes-N, and COCO-WAN), covering both realistic annotation noise and weak-annotation noise. Our methodology comprises four components: synthetic noise modeling, foundation-model-based weak-label generation, semi-automated annotation simulation, and a multi-scale segmentation evaluation framework. Extensive experiments reveal that mainstream label-noise robustness methods consistently fail on boundary-sensitive segmentation tasks. Quantitative analysis further demonstrates the limited efficacy of existing denoising techniques in instance segmentation. This work establishes the first reproducible, noise-robust instance segmentation benchmark and diagnostic toolkit, providing both methodological guidelines and standardized data resources to advance future research.
📝 Abstract
Obtaining accurate labels for instance segmentation is particularly challenging due to the complex nature of the task. Each image requires multiple annotations, covering not only the class of each object but also its precise spatial boundaries. These requirements raise the likelihood of errors and inconsistencies in both manual and automated annotation processes. By simulating different noise conditions, we introduce two benchmarks, COCO-N and Cityscapes-N, that provide a realistic scenario for assessing the robustness and generalization capabilities of instance segmentation models across different segmentation tasks. We also propose a benchmark for weak annotation noise, dubbed COCO-WAN, which uses foundation models and weak annotations to simulate semi-automated annotation tools and their noisy labels. This study sheds light on the quality of segmentation masks produced by various models and challenges the efficacy of popular methods designed to address learning with label noise.
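To make the idea of simulated annotation noise concrete, the following is a minimal, purely illustrative sketch. It is not the noise model used to build COCO-N or Cityscapes-N, which the abstract does not specify. It assumes two hypothetical noise types: a class flip and a one-pixel boundary perturbation (dilation or erosion of the binary instance mask). The function names, the annotation dictionary layout, and the probabilities `p_flip` and `p_boundary` are all assumptions introduced for this example.

```python
import random

# 4-connected neighbourhood used for the boundary perturbations.
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def dilate(mask):
    """Grow a binary mask (list of 0/1 rows) by one pixel."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = 1
    return out


def erode(mask):
    """Shrink a binary mask by one pixel.

    A pixel survives only if all four neighbours are inside the image
    and set; pixels outside the image count as background.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] and all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy, dx in NEIGHBOURS
            ):
                out[y][x] = 1
    return out


def add_label_noise(instance, classes, rng, p_flip=0.2, p_boundary=0.5):
    """Return a noisy copy of one instance annotation (hypothetical scheme).

    instance: {"label": str, "mask": list of 0/1 rows}
    p_flip: probability of replacing the class with a different one
    p_boundary: probability of dilating or eroding the mask by one pixel
    """
    label, mask = instance["label"], instance["mask"]
    if rng.random() < p_flip:
        label = rng.choice([c for c in classes if c != label])
    if rng.random() < p_boundary:
        mask = rng.choice([dilate, erode])(mask)
    return {"label": label, "mask": [row[:] for row in mask]}
```

Applying `add_label_noise` to every instance of a clean training set, with a seeded `random.Random`, would give a reproducible noisy variant whose severity is controlled by the two probabilities. The benchmarks themselves may use different or richer perturbations.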