🤖 AI Summary
Deep learning models often suffer from poor generalization and untrustworthy predictions due to spurious correlations—such as background-label confounding—in training data. To address this, we propose a novel unsupervised debiasing framework based on conditional diffusion models. Leveraging their intrinsic sensitivity to bias, our method generates bias-aligned images to train a bias amplifier, effectively transforming the diffusion model’s inherent “defects” into an interpretable and controllable debiasing tool. Concurrently, it mitigates over-memorization of training data. Crucially, our approach requires no human-annotated bias labels. Evaluated on multiple benchmark datasets, it significantly outperforms existing state-of-the-art methods, substantially improving model robustness and trustworthiness under out-of-distribution conditions. By enabling bias modeling and disentanglement without supervision, our work establishes a new paradigm for addressing data bias in deep learning.
📝 Abstract
The effectiveness of deep learning models in classification tasks is often constrained by the quality and quantity of their training data: when the data contain strong spurious correlations between specific attributes and target labels, models can acquire biases that are difficult to recover from. Tackling these biases is crucial for improving model generalization and trustworthiness, especially in real-world scenarios. This paper presents Diffusing DeBias (DDB), a novel approach that acts as a plug-in for common model debiasing methods while exploiting the inherent bias-learning tendency of diffusion models. Our approach leverages conditional diffusion models to generate synthetic bias-aligned images, which are used to train a bias amplifier model that is then employed as an auxiliary component in different unsupervised debiasing approaches. The proposed method, which also addresses the training-set memorization issue typical of such techniques, outperforms the current state of the art on multiple benchmark datasets by significant margins, demonstrating its potential as a versatile and effective tool for tackling dataset bias in deep learning applications.
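The core mechanism described above — using a bias amplifier's confidence to steer training toward bias-conflicting samples — can be sketched as follows. This is a minimal illustrative assumption, not the paper's exact formulation: the function name, the `1 - p` weighting rule, and the normalization are all hypothetical choices standing in for whatever auxiliary reweighting the downstream debiasing method actually uses.

```python
import numpy as np

def debias_weights(bias_amp_probs, eps=1e-8):
    """Turn a (hypothetical) bias amplifier's per-sample confidence
    into training weights: samples the amplifier classifies confidently
    are likely bias-aligned and get down-weighted, while low-confidence
    (bias-conflicting) samples are up-weighted.

    bias_amp_probs: probability the amplifier assigns to the true label
                    of each training sample, in [0, 1].
    Returns weights normalized to sum to 1.
    """
    p = np.asarray(bias_amp_probs, dtype=float)
    w = 1.0 - p                      # high amplifier confidence -> low weight
    return w / (w.sum() + eps)       # normalize into a sampling distribution

# Example: two bias-aligned samples (easily fit by the amplifier)
# and one bias-conflicting sample (poorly fit).
weights = debias_weights([0.95, 0.90, 0.20])
```

In this sketch the bias-conflicting sample receives by far the largest weight, which is the intended effect: the main model's training signal is concentrated on examples that break the spurious correlation.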