Rethinking Debiasing: Real-World Bias Analysis and Mitigation

📅 2024-05-24
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
Spurious correlations severely impair model generalization under distribution shifts. This paper first identifies a critical gap between existing benchmarks and real-world bias: they fail to capture both the *strength* and *prevalence* of spurious correlations. To address this, we propose a fine-grained bias disentanglement analysis framework and construct two more realistic bias-evaluation distributions. Then, under the unbiased supervision setting, we introduce a generic, lightweight “Debias in Destruction” (DiD) method—a destructive debiasing paradigm that actively disrupts spurious feature reliance. Theoretically grounded and empirically validated, DiD significantly enhances the robustness and out-of-distribution generalization of mainstream debiasing methods across image and language tasks. Cross-modal experiments further confirm its effectiveness and transferability. Our work establishes a new paradigm for modeling and mitigating spurious correlations in practical scenarios.

📝 Abstract
Spurious correlations in training data significantly hinder the generalization capability of machine learning models when faced with distribution shifts in real-world scenarios. To tackle the problem, numerous debiasing approaches have been proposed and benchmarked on datasets intentionally designed with severe biases. However, it remains to be asked: *1. Do existing benchmarks really capture biases in the real world? 2. Can existing debiasing methods handle biases in the real world?* To answer these questions, we revisit biased distributions in existing benchmarks and real-world datasets, and propose a fine-grained framework for analyzing dataset bias by disentangling it into the magnitude and prevalence of bias. We empirically and theoretically identify key characteristics of real-world biases that are poorly represented by existing benchmarks. We further introduce two novel biased distributions to bridge this gap, forming a systematic evaluation framework for real-world debiasing. With this evaluation framework, we focus on the practical setting of debiasing without bias supervision and find existing methods incapable of handling real-world biases. Through in-depth analysis, we propose a simple yet effective approach that can be easily applied to existing debiasing methods, named Debias in Destruction (DiD). Empirical results on real-world datasets in both image and language modalities demonstrate the superiority of DiD, improving the performance of existing methods on all types of biases.
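The abstract's disentanglement of dataset bias into *magnitude* (how strongly a class co-occurs with a spurious attribute) and *prevalence* (how many classes are affected) can be illustrated with a toy sketch. The function name `bias_profile` and these exact definitions are illustrative assumptions, not the paper's formulas:

```python
from collections import Counter

def bias_profile(labels, attrs):
    """Toy bias-disentanglement sketch (hypothetical definitions).

    magnitude: per class, the fraction of samples sharing that class's
    most frequent spurious attribute.
    prevalence: fraction of classes whose magnitude exceeds the chance
    level implied by the number of distinct attributes.
    """
    classes = set(labels)
    n_attrs = len(set(attrs))
    magnitude = {}
    for c in classes:
        # Attributes of all samples belonging to class c.
        class_attrs = [a for l, a in zip(labels, attrs) if l == c]
        # Count of the single most frequent attribute within the class.
        top_count = Counter(class_attrs).most_common(1)[0][1]
        magnitude[c] = top_count / len(class_attrs)
    chance = 1.0 / n_attrs  # expected alignment under no correlation
    prevalence = sum(m > chance for m in magnitude.values()) / len(classes)
    return magnitude, prevalence
```

For example, a dataset where every "cat" image is indoors and most "dog" images are outdoors would yield a high magnitude for both classes and a prevalence of 1.0, flagging a pervasive spurious correlation.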
Problem

Research questions and friction points this paper is trying to address.

Analyzing real-world dataset biases in machine learning models
Evaluating debiasing methods' effectiveness on real-world biases
Proposing a new debiasing approach for real-world scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained framework for bias analysis
Novel biased distributions for evaluation
Debias in Destruction (DiD) method
Zhibo Wang
The State Key Laboratory of Blockchain and Data Security, Zhejiang University; School of Cyber Science and Technology, Zhejiang University
Peng Kuang
School of Cyber Science and Technology, Zhejiang University
Zhixuan Chu
Associate Professor, Zhejiang University; Alibaba Group; Ant Group
Jingyi Wang
School of Control Science and Engineering, Zhejiang University
Kui Ren
Professor and Dean of Computer Science, Zhejiang University, ACM/IEEE Fellow
Data Security & Privacy · AI Security · IoT & Vehicular Security