Mitigating Bias Using Model-Agnostic Data Attribution

📅 2024-05-08

🏛️ 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To address model bias in image data, this paper proposes a model-agnostic data attribution and intervention method. First, lightweight CNNs are trained on small image patches to localize pixel-level bias-sensitive regions. Targeted noise is then injected into these regions, perturbing the data itself so that the model is constrained to learn bias-invariant features. The attribution mechanism, in which a single patch is used to predict a bias attribute of the whole image, localizes bias without modifying the model architecture or loss function, which keeps the method interpretable and broadly applicable. Evaluated on strongly biased datasets (e.g., gender- or race-related classification tasks), it significantly improves fairness metrics (e.g., reducing ΔDP and ΔEO) while preserving primary task accuracy, demonstrating the effectiveness and practicality of data-level bias mitigation.

📝 Abstract

Mitigating bias in machine learning models is a critical endeavor for ensuring fairness and equity. In this paper, we propose a novel approach to address bias by leveraging pixel image attributions to identify and regularize regions of images containing significant information about bias attributes. Our method utilizes a model-agnostic approach to extract pixel attributions by employing a convolutional neural network (CNN) classifier trained on small image patches. By training the classifier to predict a property of the entire image using only a single patch, we achieve region-based attributions that provide insights into the distribution of important information across the image. We propose utilizing these attributions to introduce targeted noise into datasets with confounding attributes that bias the data, thereby constraining neural networks from learning these biases and emphasizing the primary attributes. Our approach demonstrates its efficacy in enabling the training of unbiased classifiers on heavily biased datasets.
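The region-based attribution described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `patch_attribution_map`, the window size, the stride, and the per-pixel averaging are all assumptions made for illustration, and `patch_score` is a hypothetical stand-in for the trained patch classifier (here replaced by a toy brightness score so the example runs without a deep learning framework).

```python
import numpy as np

def patch_attribution_map(image, patch_score, patch=8, stride=4):
    """Slide a window over `image` and average the patch scores per pixel.

    `patch_score` is a hypothetical callable standing in for a trained patch
    classifier: it receives one (patch, patch) crop and returns a float,
    e.g. the probability the classifier assigns to the true bias label.
    """
    h, w = image.shape[:2]
    total = np.zeros((h, w), dtype=np.float64)
    count = np.zeros((h, w), dtype=np.float64)
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            crop = image[top:top + patch, left:left + patch]
            score = float(patch_score(crop))
            total[top:top + patch, left:left + patch] += score
            count[top:top + patch, left:left + patch] += 1.0
    # Pixels that no window covers keep an attribution of 0.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

# Toy data (assumption): the "bias" signal is a bright square in the top-left.
image = np.zeros((16, 16))
image[:8, :8] = 1.0

# Toy scorer (assumption): mean brightness replaces a real classifier output.
attribution = patch_attribution_map(image, patch_score=lambda crop: crop.mean())
```

In this toy setup the map is highest over the bright square and falls to zero in the opposite corner, which is the kind of spatial distribution of bias information the abstract refers to.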

Problem

Research questions and friction points this paper is trying to address.

Mitigating bias in machine learning models

Identifying bias regions using pixel attributions

Training unbiased classifiers on biased datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pixel image attributions for bias identification

Trains a CNN on single image patches to obtain model-agnostic attributions

Introduces targeted noise to mitigate dataset biases
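The targeted-noise idea listed above can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the function name `inject_targeted_noise`, the hard threshold on the attribution map, the Gaussian noise model, and the values of `threshold` and `sigma` are all hypothetical choices. The image is assumed to be a float array scaled to [0, 1].

```python
import numpy as np

def inject_targeted_noise(image, attribution, threshold=0.5, sigma=0.5, seed=0):
    """Add Gaussian noise only where `attribution` reaches `threshold`.

    Assumes `image` has shape (H, W) or (H, W, C) with values in [0, 1],
    and `attribution` is an (H, W) map of bias relevance.
    """
    rng = np.random.default_rng(seed)
    mask = attribution >= threshold
    if image.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over the channel axis
    noise = rng.normal(0.0, sigma, size=image.shape)
    # Pixels outside the mask receive zero noise and stay unchanged.
    return np.clip(image + mask * noise, 0.0, 1.0)

# Toy data (assumption): a flat gray image whose top-left quadrant is
# marked as bias-relevant by the attribution map.
image = np.full((8, 8), 0.5)
attribution = np.zeros((8, 8))
attribution[:4, :4] = 1.0

noisy = inject_targeted_noise(image, attribution)
```

Only the flagged quadrant is perturbed, so a classifier trained on such images would find the bias-carrying region less reliable while the rest of the image is left intact. A soft weighting of the noise by the attribution value would be an equally plausible variant.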
