🤖 AI Summary
Spurious correlations severely impair model generalization under distribution shifts. This paper first identifies a critical gap between existing benchmarks and real-world bias: they fail to capture both the *magnitude* and *prevalence* of spurious correlations. To address this, we propose a fine-grained bias disentanglement analysis framework and construct two more realistic bias-evaluation distributions. Then, under the practical setting of debiasing without bias supervision, we introduce a generic, lightweight “Debias in Destruction” (DiD) method, a destructive debiasing paradigm that actively disrupts reliance on spurious features. Theoretically grounded and empirically validated, DiD significantly enhances the robustness and out-of-distribution generalization of mainstream debiasing methods across image and language tasks. Cross-modal experiments further confirm its effectiveness and transferability. Our work establishes a new paradigm for modeling and mitigating spurious correlations in practical scenarios.
📝 Abstract
Spurious correlations in training data significantly hinder the generalization capability of machine learning models when faced with distribution shifts in real-world scenarios. To tackle the problem, numerous debiasing approaches have been proposed and benchmarked on datasets intentionally designed with severe biases. However, it remains to be asked: *1. Do existing benchmarks really capture biases in the real world? 2. Can existing debiasing methods handle biases in the real world?* To answer these questions, we revisit biased distributions in existing benchmarks and real-world datasets, and propose a fine-grained framework for analyzing dataset bias by disentangling it into the magnitude and prevalence of bias. We empirically and theoretically identify key characteristics of real-world biases that are poorly represented by existing benchmarks. We further introduce two novel biased distributions to bridge this gap, forming a systematic evaluation framework for real-world debiasing. With this evaluation framework, we focus on the practical setting of debiasing without bias supervision and find existing methods incapable of handling real-world biases. Through in-depth analysis, we propose a simple yet effective approach named Debias in Destruction (DiD) that can be easily applied to existing debiasing methods. Empirical results on real-world datasets in both image and language modalities demonstrate the superiority of DiD, improving the performance of existing methods on all types of biases.
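To make the magnitude/prevalence distinction concrete, here is a minimal toy sketch. The two functions below are illustrative proxies, not the paper's exact metrics: `bias_magnitude` measures how strongly one class co-occurs with a spurious attribute, while `bias_prevalence` measures how many classes are affected at all. The data, attribute names, and threshold are all made up for illustration.

```python
from collections import Counter

# Toy dataset: a class label and a spurious attribute (e.g., background
# color) per sample. Purely illustrative, not from the paper.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
spurious = ["red", "red", "red", "blue", "blue", "blue", "red", "red"]

def bias_magnitude(labels, spurious, cls, attr):
    """Fraction of class `cls` samples carrying attribute `attr`:
    a proxy for how *strongly* the class correlates with the attribute."""
    in_cls = [s for y, s in zip(labels, spurious) if y == cls]
    return sum(s == attr for s in in_cls) / len(in_cls)

def bias_prevalence(labels, spurious, threshold=0.6):
    """Fraction of classes whose majority attribute share exceeds
    `threshold`: a proxy for how *widespread* the bias is."""
    classes = sorted(set(labels))
    biased = 0
    for c in classes:
        in_cls = [s for y, s in zip(labels, spurious) if y == c]
        top_share = Counter(in_cls).most_common(1)[0][1] / len(in_cls)
        if top_share >= threshold:
            biased += 1
    return biased / len(classes)

print(bias_magnitude(labels, spurious, 0, "red"))  # 0.75
print(bias_prevalence(labels, spurious))           # 0.5
```

Under these toy definitions, class 0 is strongly tied to "red" (magnitude 0.75), but only one of the two classes exceeds the bias threshold, so the bias is strong yet not prevalent; a benchmark that only sweeps magnitude would miss this axis entirely.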