🤖 AI Summary
Machine learning models often amplify biases present in training data, leading to disparate performance across demographic groups. This work develops an exact analytical theory for ridge regression, with and without random projections, that rigorously characterizes bias amplification and minority-group overfitting within a solvable high-dimensional model. The theory shows that both the regularization strength and the training duration admit optimal values that minimize bias, and that the gap in group-wise test errors need not vanish as the model's parameter count grows. By combining high-dimensional asymptotics with random-projection analysis, the framework quantifies how model design choices and data distribution jointly determine bias. Experiments on synthetic and semi-synthetic datasets show strong agreement between the theoretical predictions and observed phenomena, including bias amplification and minority-group overfitting, providing interpretable, quantitative guidance for fairness-aware model design.
📝 Abstract
Machine learning models may capture and amplify biases present in data, leading to disparate test performance across social groups. To better understand, evaluate, and mitigate these possible biases, a deeper theoretical understanding of how model design choices and data distribution properties contribute to bias is needed. In this work, we contribute a precise analytical theory in the context of ridge regression, both with and without random projections, where the random-projection variant models neural networks in a simplified regime. Our theory offers a unified and rigorous explanation of machine learning bias, providing insights into phenomena such as bias amplification and minority-group bias across feature and parameter regimes. For example, we demonstrate that there may be an optimal regularization penalty or training time that avoids bias amplification, and that there can be fundamental differences in test error between groups that do not vanish with increased parameterization. Importantly, our theoretical predictions align with several empirical observations reported in the literature. We extensively validate our theory on diverse synthetic and semi-synthetic datasets.
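To make the setting concrete, here is a minimal simulation sketch of the phenomenon the abstract describes. It is our own illustrative toy model, not the paper's exact data model or analysis: two groups share an input distribution but have slightly different ground-truth regressors, the training sample is imbalanced (majority vs. minority), and ridge regression is fit on pooled data after a fixed random projection of the features. The group sizes, noise level, and projection dimension below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (not the paper's exact model): shared Gaussian
# inputs, group-specific ground-truth regressors, imbalanced group sizes.
d, p = 50, 200                      # input dim, projected (parameter) dim
n_maj, n_min, n_test = 400, 40, 2000
noise = 0.1

w_maj = rng.normal(size=d) / np.sqrt(d)
w_min = w_maj + 0.5 * rng.normal(size=d) / np.sqrt(d)  # minority target differs

def sample(n, w):
    X = rng.normal(size=(n, d))
    return X, X @ w + noise * rng.normal(size=n)

# Fixed random projection, a stand-in for random-feature models of networks.
F = rng.normal(size=(d, p)) / np.sqrt(d)

def ridge_fit(Z, y, lam):
    # Closed-form ridge solution in the projected feature space.
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

Xa, ya = sample(n_maj, w_maj)
Xb, yb = sample(n_min, w_min)
Z = np.vstack([Xa, Xb]) @ F          # pooled, projected training features
y = np.concatenate([ya, yb])

Xta, yta = sample(n_test, w_maj)     # held-out data per group
Xtb, ytb = sample(n_test, w_min)

lams = np.logspace(-3, 3, 25)
err_maj, err_min = [], []
for lam in lams:
    w_hat = ridge_fit(Z, y, lam)
    err_maj.append(np.mean((Xta @ F @ w_hat - yta) ** 2))
    err_min.append(np.mean((Xtb @ F @ w_hat - ytb) ** 2))

gap = np.array(err_min) - np.array(err_maj)
print("gap at weakest penalty:", round(gap[0], 3))
print("penalty minimizing the gap:", lams[np.argmin(gap)])
```

Because the pooled fit is dominated by the majority group, the minority group's test error exceeds the majority's, and sweeping the penalty traces out how regularization strength modulates that disparity, which is the kind of group-wise test-error behavior the theory characterizes exactly.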