Fairness Audits of Institutional Risk Models in Deployed ML Pipelines

📅 2026-04-21

🤖 AI Summary
This study audits a machine learning–based early warning system (EWS) that a higher education institution uses to allocate student support resources, focusing on disparities by gender, age, and residency status. Through a multi-year collaboration with Centennial College, the authors replicate the institution's deployed model from its training data and design specifications, then evaluate the full pipeline (training data, model predictions, and post-processing) with standard fairness metrics. The audit finds that younger, male, and international students are disproportionately flagged for support even though many ultimately succeed, while older and female students with comparable dropout risk are under-identified. Post-processing amplifies these disparities by collapsing heterogeneous probabilities into percentile-based risk tiers. The work offers a replicable methodology for auditing institutional ML systems and argues for evaluating construct validity alongside statistical fairness.

📝 Abstract
Fairness audits of institutional risk models are critical for understanding how deployed machine learning pipelines allocate resources. Drawing on a multi-year collaboration with Centennial College, where our prior ethnographic work introduced the ASP-HEI Cycle, we present a replica-based audit of a deployed Early Warning System (EWS), replicating its model using institutional training data and design specifications. We evaluate disparities by gender, age, and residency status across the full pipeline (training data, model predictions, and post-processing) using standard fairness metrics. Our audit reveals systematic misallocation: younger, male, and international students are disproportionately flagged for support, even when many ultimately succeed, while older and female students with comparable dropout risk are under-identified. Post-processing amplifies these disparities by collapsing heterogeneous probabilities into percentile-based risk tiers. This work provides a replicable methodology for auditing institutional ML systems and shows how disparities emerge and compound across stages, highlighting the importance of evaluating construct validity alongside statistical fairness. It contributes one empirical thread to a broader program investigating algorithms, student data, and power in higher education.
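To make the post-processing step concrete, the following is a minimal sketch of how percentile-based tiering and group-wise fairness checks might be computed. It is not the authors' implementation: the abstract does not name the specific metrics or tier boundaries, so the choice of flag rate and false positive rate, the 50th-percentile cutoff, the function names `percentile_tiers` and `group_rates`, and all data values are assumptions made for illustration only.

```python
from collections import defaultdict


def percentile_tiers(probs, cutoffs=(0.5,)):
    """Assign each student a tier from their percentile rank in the cohort.

    Tier 0 is the lowest-risk tier. The cutoffs are hypothetical; the
    source does not state the institution's actual tier boundaries.
    """
    n = len(probs)
    order = sorted(range(n), key=lambda i: probs[i])
    tiers = [0] * n
    for rank, i in enumerate(order):
        pct = rank / n  # share of the cohort ranked below this student
        tiers[i] = sum(pct >= c for c in cutoffs)
    return tiers


def group_rates(groups, flagged, dropped_out):
    """Per-group flag rate and false positive rate.

    The false positive rate here is the share of students who did NOT
    drop out but were flagged anyway.
    """
    counts = defaultdict(lambda: {"n": 0, "flagged": 0, "neg": 0, "fp": 0})
    for g, f, y in zip(groups, flagged, dropped_out):
        c = counts[g]
        c["n"] += 1
        c["flagged"] += int(f)
        if not y:
            c["neg"] += 1
            c["fp"] += int(f)
    return {
        g: {
            "flag_rate": c["flagged"] / c["n"],
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
        for g, c in counts.items()
    }


# Synthetic toy cohort (NOT data from the paper).
groups = ["younger"] * 4 + ["older"] * 4
probs = [0.62, 0.58, 0.55, 0.30, 0.60, 0.54, 0.20, 0.10]
dropped_out = [1, 0, 0, 0, 1, 1, 0, 0]

tiers = percentile_tiers(probs)          # top half of the cohort -> tier 1
flagged = [t >= 1 for t in tiers]        # tier 1 students are flagged
rates = group_rates(groups, flagged, dropped_out)

for group, r in rates.items():
    print(group, r)
```

In this toy cohort, an older student with a predicted probability of 0.54 who did drop out falls just below the cutoff, while a younger student at 0.55 who did not drop out is flagged. Near-identical probabilities land in different tiers, which is one way a percentile cutoff can widen group disparities that are small at the probability level.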
Problem

Research questions and friction points this paper is trying to address.

fairness audit
institutional risk models
machine learning pipelines
algorithmic bias
educational equity
Innovation

Methods, ideas, or system contributions that make the work stand out.

fairness audit
replica-based modeling
early warning system
algorithmic bias
construct validity