OxEnsemble: Fair Ensembles for Low-Data Classification

📅 2025-12-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Fair classification in low-data, highly imbalanced medical imaging domains, where false negatives carry high clinical costs, remains challenging because labeled samples are scarce and fairness requirements are stringent. Method: We propose a lightweight model ensemble framework that enforces fairness constraints on each ensemble member. It introduces a member-prediction aggregation scheme under fairness constraints, a data-efficient training mechanism that reuses held-out validation data, and new theoretical guarantees for fair ensembling in low-data regimes. Contributions/Results: Evaluated on multiple medical imaging datasets, our method reduces the equal opportunity difference by 37% on average while maintaining >98% accuracy. Training overhead is only marginally higher than single-model fine-tuning, improving the fairness–accuracy trade-off without compromising computational efficiency or clinical utility.
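The summary reports results in terms of the equal opportunity difference. As a point of reference, the sketch below computes that metric under its usual definition: the largest gap in true positive rate between demographic groups. The function names and the toy data are illustrative assumptions and are not taken from the paper.

```python
def true_positive_rate(y_true, y_pred):
    """Fraction of actual positives that were predicted positive, or None if there are none."""
    preds_on_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_on_positives:
        return None  # TPR is undefined without positive cases
    return sum(preds_on_positives) / len(preds_on_positives)


def equal_opportunity_difference(y_true, y_pred, group):
    """Largest gap in true positive rate between any two groups (0 means equal opportunity)."""
    tprs = []
    for g in sorted(set(group)):
        idx = [i for i, gi in enumerate(group) if gi == g]
        tpr = true_positive_rate([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if tpr is not None:
            tprs.append(tpr)
    if not tprs:
        raise ValueError("no group contains a positive example")
    return max(tprs) - min(tprs)


# Toy data (hypothetical): group 0 has four positives, group 1 has two.
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
group = [0, 0, 0, 0, 0, 0, 1, 1]
eod = equal_opportunity_difference(y_true, y_pred, group)
```

In this toy example the true positive rate is 0.75 for group 0 and 0.5 for group 1, so the gap is 0.25. A lower value means false negatives are spread more evenly across groups.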

📝 Abstract
We address the problem of fair classification in settings where data is scarce and unbalanced across demographic groups. Such low-data regimes are common in domains like medical imaging, where false negatives can have fatal consequences. We propose a novel approach, OxEnsemble, for efficiently training ensembles and enforcing fairness in these low-data regimes. Unlike other approaches, we aggregate predictions across ensemble members, each trained to satisfy fairness constraints. By construction, OxEnsemble is both data-efficient, carefully reusing held-out data to enforce fairness reliably, and compute-efficient, requiring little more compute than used to fine-tune or evaluate an existing model. We validate this approach with new theoretical guarantees. Experimentally, our approach yields more consistent outcomes and stronger fairness-accuracy trade-offs than existing methods across multiple challenging medical imaging classification datasets.
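The abstract states that predictions are aggregated across ensemble members but does not specify the aggregation rule. The sketch below assumes the simplest option, averaging each member's positive-class probability and thresholding the mean. The function `aggregate_predictions`, the 0.5 threshold, and the example probabilities are all hypothetical and may differ from what OxEnsemble actually does.

```python
def aggregate_predictions(member_probs, threshold=0.5):
    """Average positive-class probabilities over members, then threshold.

    member_probs: one list per ensemble member, each holding that member's
    predicted probability for every sample (all lists have equal length).
    Averaging is an assumption made for illustration only.
    """
    n_members = len(member_probs)
    n_samples = len(member_probs[0])
    mean_probs = [
        sum(member[i] for member in member_probs) / n_members
        for i in range(n_samples)
    ]
    labels = [1 if p >= threshold else 0 for p in mean_probs]
    return mean_probs, labels


# Three hypothetical members scoring three samples.
member_probs = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.3],
    [0.8, 0.3, 0.7],
]
mean_probs, labels = aggregate_predictions(member_probs)
```

Here the second member alone would reject the third sample, while the averaged score accepts it. Smoothing out disagreements between individual members in this way is one plausible reason an ensemble gives more consistent outcomes.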
Problem

Research questions and friction points this paper is trying to address.

Addresses fair classification in low-data, unbalanced demographic settings
Proposes OxEnsemble for efficient fairness enforcement with limited data
Improves fairness-accuracy trade-offs in medical imaging classification tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble training with fairness constraints
Data-efficient reuse of held-out data
Compute-efficient fine-tuning of existing models
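One way to picture the reuse of held-out data across an ensemble is a cross-fold layout, in which each member trains on most of the data and keeps a different slice aside for checking its fairness constraints. This is only an assumed scheme for illustration: the listing does not describe how OxEnsemble partitions data, and `make_member_splits` is a hypothetical helper.

```python
def make_member_splits(n_samples, n_members):
    """Give each ensemble member a training set and its own held-out fold.

    Every sample is held out by exactly one member and used for training
    by all the others, so no data is permanently set aside.
    """
    indices = list(range(n_samples))
    # Interleaved folds: fold j receives samples j, j + n_members, ...
    folds = [indices[j::n_members] for j in range(n_members)]
    splits = []
    for j in range(n_members):
        held_out = folds[j]
        train = [i for k, fold in enumerate(folds) if k != j for i in fold]
        splits.append((train, held_out))
    return splits


# Ten samples shared among five hypothetical ensemble members.
splits = make_member_splits(n_samples=10, n_members=5)
for member_id, (train, held_out) in enumerate(splits):
    print(member_id, "train:", train, "held out:", held_out)
```

With ten samples and five members, each member trains on eight samples and evaluates fairness on the remaining two, which matters when positive cases from a minority group are too few to spare for a single fixed validation set.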