🤖 AI Summary
This work addresses the Segment Anything Model's (SAM) unreliable confidence estimates for boundary pixels under domain shift. The authors attribute the problem to mask-level confidence confusion (MCC): a single IoU-based mask score cannot reflect how reliable individual pixels are. To tackle this, they propose RUAC (Robust Uncertainty-Accuracy Correlation), a framework that, for the first time in zero-shot segmentation, trains a lightweight uncertainty head with a coordinated style-deformation adversarial attack that jointly perturbs texture and geometric structure. An uncertainty-accuracy alignment mechanism then concentrates the uncertainty estimates on erroneously predicted regions. Evaluated across 23 zero-shot domains, the method improves both segmentation quality and the faithfulness of its uncertainty estimates, with a stronger correlation between predictive uncertainty and actual accuracy.
📝 Abstract
Despite strong zero-shot performance, SAM is unreliable under domain shift due to Mask-level Confidence Confusion (MCC), where a single IoU-based mask score fails to reflect pixel-wise reliability near boundaries. Motivated by the contrast between texture-biased shortcuts in neural networks and shape-centric processing in human vision, we model out-of-domain variation as appearance shifts and non-rigid deformations that jointly stress calibration. We propose Segment Anything with Robust Uncertainty-Accuracy Correlation (RUAC) for robust pixel-wise uncertainty estimation under appearance and deformation shifts. RUAC adds a lightweight uncertainty head, trains it with a collaborative style-deformation attack that jointly perturbs texture and geometry, and applies Uncertainty-Accuracy Alignment to ensure uncertainty consistently highlights erroneous pixels even under adversarial perturbations. Across 23 zero-shot domains, RUAC improves segmentation quality and yields more faithful uncertainty with stronger uncertainty-accuracy correlation. Project page: https://github.com/HongyouZhou/ruac.git.
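To make the contrast between a mask-level score and pixel-wise uncertainty concrete, here is a minimal NumPy sketch. It is not the RUAC implementation: the abstract does not specify the uncertainty head or the exact correlation metric. As stand-in assumptions, the sketch uses the binary entropy of sigmoid probabilities as the per-pixel uncertainty and Pearson correlation against the 0/1 error map as the uncertainty-accuracy measure. All function names and the toy data are hypothetical.

```python
import numpy as np


def pixel_entropy(logits):
    """Binary entropy (nats) of the sigmoid probability at each pixel."""
    p = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12  # avoids log(0) for saturated probabilities
    return -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))


def mask_iou(logits, target):
    """One IoU score for the whole mask (logits thresholded at 0)."""
    pred = logits > 0
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def uncertainty_error_correlation(logits, target):
    """Pearson correlation between per-pixel uncertainty and the 0/1 error map."""
    error = ((logits > 0) != target.astype(bool)).astype(float).ravel()
    uncertainty = pixel_entropy(logits).ravel()
    if error.std() == 0 or uncertainty.std() == 0:
        return float("nan")  # correlation is undefined for a constant map
    return float(np.corrcoef(uncertainty, error)[0, 1])


def toy_example():
    """Hypothetical 16x16 case: the predicted square is shifted one column."""
    target = np.zeros((16, 16), dtype=bool)
    target[4:12, 4:12] = True
    pred = np.zeros_like(target)
    pred[4:12, 5:13] = True
    confidence = np.full(target.shape, 6.0)  # confident away from the edges
    confidence[4:12, 3:6] = 0.5              # hesitant band around the left edge
    confidence[4:12, 11:14] = 0.5            # hesitant band around the right edge
    logits = np.where(pred, confidence, -confidence)
    return logits, target


logits, target = toy_example()
print("mask IoU:", round(mask_iou(logits, target), 3))
print("uncertainty-error correlation:",
      round(uncertainty_error_correlation(logits, target), 3))
```

In this toy case the single IoU value says nothing about where the mask is wrong, while the entropy map is high exactly in the bands that contain the misclassified boundary pixels, so the correlation with the error map is positive. A learned uncertainty head, as described in the abstract, would replace the entropy stand-in used here.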