AI Summary
Deep neural networks often rely on spurious correlations (e.g., superficial background-class associations), leading to poor out-of-distribution generalization. To address this, we propose an evidential alignment framework that identifies and suppresses spurious patterns via uncertainty quantification, without requiring group annotations. Our method introduces a second-order risk minimization principle for evidence quantification and calibration, enabling theoretically grounded disentanglement of bias-aligned features from invariant (core) features, even in the absence of explicit spurious correlation labels. It supports unsupervised bias modeling, scales across architectures and modalities, and comes with theoretical guarantees. Extensive experiments demonstrate significant improvements in group robustness across diverse benchmarks, including vision and multimodal tasks, outperforming state-of-the-art annotation-free baselines while maintaining computational efficiency and interpretability.
Abstract
Deep neural networks often learn and rely on spurious correlations, i.e., superficial associations between non-causal features and the targets. For instance, an image classifier may identify camels based on desert backgrounds. While such shortcuts can yield high overall accuracy during training, they degrade generalization in more diverse scenarios where the correlations no longer hold. This problem poses significant challenges for out-of-distribution robustness and trustworthiness. Existing methods typically mitigate this issue by using external group annotations or auxiliary deterministic models to learn unbiased representations. However, such annotations are costly to obtain, and deterministic models may fail to capture the full spectrum of biases a model has learned. To address these limitations, we propose Evidential Alignment, a novel framework that leverages uncertainty quantification to understand the behavior of biased models without requiring group annotations. By quantifying the evidence behind model predictions with second-order risk minimization and calibrating the biased models with the proposed evidential calibration technique, Evidential Alignment identifies and suppresses spurious correlations while preserving core features. We theoretically show that our method can learn the patterns of biased models and debias the target model without any spurious correlation annotations. Empirical results demonstrate that our method significantly improves group robustness across diverse architectures and data modalities, providing a scalable and principled solution to spurious correlations.
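The abstract does not spell out how prediction evidence is quantified, but second-order risk minimization is commonly instantiated with a Dirichlet-based evidential classifier head. The sketch below illustrates that general idea only: logits are mapped to non-negative evidence, the evidence parameterizes a Dirichlet distribution over class probabilities, and the Dirichlet strength yields a scalar uncertainty. All function names are illustrative assumptions, not the paper's API.

```python
import math

def evidential_uncertainty(logits):
    """Map raw class logits to Dirichlet evidence and a scalar uncertainty.

    A minimal sketch of the common evidential deep learning formulation
    (alpha = evidence + 1); it is NOT the paper's exact method.
    """
    # Non-negative evidence via softplus.
    evidence = [math.log1p(math.exp(z)) for z in logits]
    alpha = [e + 1.0 for e in evidence]   # Dirichlet concentration parameters
    strength = sum(alpha)                 # total evidence mass
    probs = [a / strength for a in alpha] # expected class probabilities
    k = len(logits)
    # Vacuity-style uncertainty: high when little evidence supports any class.
    uncertainty = k / strength
    return probs, uncertainty

# A confident prediction accumulates evidence, shrinking uncertainty,
# while a no-evidence input stays maximally uncertain.
_, u_confident = evidential_uncertainty([5.0, 0.0, 0.0])
_, u_vacuous = evidential_uncertainty([0.0, 0.0, 0.0])
```

Under this kind of formulation, samples whose predictions rest on little evidence can be flagged as potentially bias-driven, which is one plausible way an uncertainty signal could guide the calibration step the abstract describes.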