🤖 AI Summary
This study investigates the joint optimization of fairness and accuracy for machine learning models in university admissions. Addressing bias in application texts, we propose a gender-debiasing preprocessing pipeline integrating contextualized word embedding correction with domain-informed rule-based constraints. We further introduce a consistency-based evaluation framework grounded in individual fairness principles. Using real-world admissions data, we encode textual features via BERT and compare multiple classifiers—XGBoost, Bi-LSTM, and KNN—demonstrating that debiasing preserves or improves predictive accuracy while substantially enhancing individual fairness: fairness scores increase by 14.08%–18.79% over human expert decisions. To our knowledge, this is the first empirical study to show that structured debiasing coupled with rigorous fairness evaluation enables simultaneous gains in both fairness and model performance—challenging the conventional assumption of an inherent trade-off between these objectives.
📝 Abstract
Fairness in both machine learning (ML) predictions and human decisions is critical, with ML models prone to algorithmic and data bias, and human decisions affected by subjectivity and cognitive bias. This study investigates fairness using a real-world university admission dataset with 870 profiles, leveraging three ML models, namely XGB, Bi-LSTM, and KNN. Textual features are encoded with BERT embeddings. For individual fairness, we assess decision consistency among experts with varied backgrounds and ML models, using a consistency score. Results show ML models outperform humans in fairness by 14.08% to 18.79%. For group fairness, we propose a gender-debiasing pipeline and demonstrate its efficacy in removing gender-specific language without compromising prediction performance. Post-debiasing, all models maintain or improve their classification accuracy, validating the hypothesis that fairness and performance can coexist. Our findings highlight ML's potential to enhance fairness in admissions while maintaining high accuracy, advocating a hybrid approach combining human judgement and ML models.