🤖 AI Summary
This work proposes BISE, a novel debiasing approach that challenges the prevailing reliance on external unbiased data or costly retraining. It demonstrates for the first time that standardly trained models inherently contain low-bias subnetworks that can be directly leveraged without parameter modification. Through structure-adaptive pruning, BISE efficiently extracts such unbiased substructures from pretrained models without fine-tuning, altering original parameters, or requiring any external unbiased data. Extensive experiments across multiple benchmarks show that BISE substantially reduces algorithmic bias while preserving model performance, achieving this with significantly lower computational overhead compared to existing debiasing strategies.
📝 Abstract
The issue of algorithmic bias in deep learning has led to the development of various debiasing techniques, many of which involve complex training procedures or dataset manipulation. However, an intriguing question arises: is it possible to extract fair and bias-agnostic subnetworks from standard vanilla-trained models without relying on additional data, such as an unbiased training set? In this work, we introduce Bias-Invariant Subnetwork Extraction (BISE), a learning strategy that identifies and isolates "bias-free" subnetworks that already exist within conventionally trained models, without retraining or fine-tuning the original parameters. Our approach demonstrates that such subnetworks can be extracted via pruning and can operate without modification, relying less on biased features while maintaining robust performance. Our findings contribute towards efficient bias mitigation through structural adaptation of pre-trained neural networks via parameter removal, as opposed to costly strategies that are either data-centric or involve (re)training all model parameters. Extensive experiments on common benchmarks show the advantages of our approach in terms of the performance and computational efficiency of the resulting debiased model.
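The core idea of operating a pruned subnetwork without touching the pretrained weights can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the per-weight importance `scores` are a hypothetical stand-in for BISE's structure-adaptive pruning criterion, and the layer is a plain linear map. The key property shown is that pruning is applied on the fly via a binary mask, so the frozen parameters are never modified.

```python
import numpy as np

def extract_subnetwork(W, scores, keep_ratio=0.5):
    """Build a binary mask keeping the top-scoring fraction of weights.

    W      : frozen pretrained weight matrix (never modified)
    scores : per-weight importance scores (hypothetical stand-in for
             the structure-adaptive criterion described in the paper)
    """
    k = max(1, int(keep_ratio * W.size))
    threshold = np.sort(scores.ravel())[-k]   # k-th largest score
    return (scores >= threshold).astype(W.dtype)

def masked_forward(x, W, mask):
    # The subnetwork runs with the original parameters intact:
    # removed weights are zeroed elementwise at forward time.
    return x @ (W * mask)

# Toy example with a random "pretrained" layer
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))          # pretrained, kept frozen
scores = rng.random(W.shape)         # stand-in importance scores
mask = extract_subnetwork(W, scores, keep_ratio=0.5)
y = masked_forward(np.ones((1, 4)), W, mask)
```

Because only the mask is learned or selected, the original model remains fully recoverable by discarding the mask, which is what makes this style of debiasing cheap relative to retraining all parameters.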