🤖 AI Summary
This paper addresses selection and label bias in fairness evaluation for automated decision-making (e.g., hiring), biases that lead mainstream interventions such as resampling to produce a "fairness illusion." To mitigate this, we propose an audit-inspired fictitious-applicant experimental framework that integrates randomized controlled trials, base-rate parity, and individual treatment effect (ITE) estimation, enabling more realistic fairness assessment and model training. Key contributions include: (i) the first incorporation of ITE estimation into algorithmic fairness interventions, reducing overreliance on aggregate group-level metrics; and (ii) empirical evidence that conventional methods conceal roughly 10% residual discrimination, whereas our approach significantly reduces actual discriminatory outcomes. The framework strengthens the internal validity of fairness evaluation and improves intervention efficacy.
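To make the ITE idea concrete: in an audit setting the "treatment" is the protected attribute signaled on a fictitious application, and an ITE estimator predicts how each individual applicant's outcome would change under the alternative attribute. A minimal sketch is a T-learner, which fits one outcome model per treatment arm and differences the predictions; the linear models and function names below are illustrative assumptions, not the paper's actual estimators.

```python
import numpy as np

def t_learner_ite(X, a, y):
    """T-learner sketch for individual treatment effects.

    X: (n, d) applicant features; a: (n,) binary protected attribute
    (the 'treatment' signaled on a fictitious application); y: (n,)
    observed outcome (e.g., callback). Fits one linear outcome model
    per arm via least squares and returns the per-applicant difference
    in predicted outcomes, i.e., the estimated ITE.
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])  # add intercept column
    w1, *_ = np.linalg.lstsq(Xb[a == 1], y[a == 1], rcond=None)
    w0, *_ = np.linalg.lstsq(Xb[a == 0], y[a == 0], rcond=None)
    return Xb @ w1 - Xb @ w0  # estimated effect for every applicant
```

Because the audit design randomizes the protected attribute, the two arms are comparable by construction, which is what licenses interpreting this difference causally.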
📝 Abstract
Artificial intelligence systems, especially those using machine learning, are being deployed in domains from hiring to loan issuance to automate these complex decisions. Judging both the effectiveness and fairness of these AI systems, and of their human decision-making counterparts, is a complex and important topic studied across the computational and social sciences. Within machine learning, a common way to address bias in downstream classifiers is to resample the training data to offset disparities. For example, if hiring rates vary by some protected class, one may equalize the rate within the training set to alleviate bias in the resulting classifier. While simple and seemingly effective, these methods have typically been evaluated only on data obtained through convenience samples, which introduces selection bias and label bias into the resulting metrics. Within the social sciences, psychology, public health, and medicine, audit studies, in which fictitious "testers" (e.g., resumes, emails, patient actors) are sent to subjects (e.g., job openings, businesses, doctors) in randomized controlled trials, provide high-quality data that support rigorous estimates of discrimination. In this paper, we investigate how data from audit studies can be used to improve our ability to both train and evaluate automated hiring algorithms. We find that such data reveal cases where the common fairness intervention of equalizing base rates across classes appears to achieve parity under traditional measures but in fact exhibits roughly 10% disparity when measured appropriately. We additionally introduce interventions based on individual treatment effect estimation that further reduce algorithmic discrimination using this data.
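The base-rate-equalization intervention the abstract critiques can be sketched as a resampling step: adjust each protected group's training subset so its positive-label (e.g., hiring) rate matches the overall rate. This is a minimal illustrative implementation of that conventional baseline, not the paper's proposed method; the function name and the choice of the overall rate as the common target are assumptions.

```python
import numpy as np

def equalize_base_rates(X, y, group, rng=None):
    """Resample training data so every protected group has the same
    positive-label rate (here: the overall positive rate).

    This is the conventional intervention that, per the paper, can
    appear to achieve parity on convenience-sample data while real
    disparity remains.
    """
    rng = rng or np.random.default_rng(0)
    target = y.mean()  # common target rate for all groups (assumption)
    keep = []
    for g in np.unique(group):
        idx = np.where(group == g)[0]
        pos, neg = idx[y[idx] == 1], idx[y[idx] == 0]
        # n_pos chosen so n_pos / (n_pos + len(neg)) == target
        n_pos = int(round(target * len(neg) / (1 - target)))
        if n_pos <= len(pos):  # downsample positives
            keep.extend(neg)
            keep.extend(rng.choice(pos, size=n_pos, replace=False))
        else:  # too few positives: downsample negatives instead
            n_neg = int(round(len(pos) * (1 - target) / target))
            keep.extend(pos)
            keep.extend(rng.choice(neg, size=n_neg, replace=False))
    keep = np.array(keep)
    return X[keep], y[keep], group[keep]
```

A classifier trained on the resampled data sees equal base rates across groups, which is exactly why traditional parity metrics computed on similarly biased data can report fairness that audit-study data contradicts.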