🤖 AI Summary
This study investigates how racially biased AI recommendations undermine human decision autonomy in resume screening. In a controlled experiment with 528 participants evaluating quality-calibrated resumes across 1,526 scenarios, racially biased AI recommendations were introduced alongside unbiased-AI and no-AI baselines. Results reveal a “low-trust–high-compliance” paradox: even when participants doubted the quality or importance of the AI’s recommendations, their final hiring decisions could still follow the AI’s bias under certain conditions, favoring the AI-preferred racial group up to 90% of the time. Critically, the study provides the first empirical evidence in a human-AI collaboration setting that administering an Implicit Association Test (IAT) before screening mitigates this effect, increasing selection rates for candidates who do not match common race-status stereotypes by 13%. These findings offer rigorous evidence on human-AI collaboration dynamics and inform bias governance strategies and human-centered AI design principles.
📝 Abstract
In this study, we conduct a resume-screening experiment (N=528) in which people collaborate with simulated AI models exhibiting race-based preferences (bias) to evaluate candidates for 16 high- and low-status occupations. Simulated AI bias approximates factual and counterfactual estimates of racial bias in real-world AI systems. We investigate people's preferences for White, Black, Hispanic, and Asian candidates (represented through names and affinity groups on quality-controlled resumes) across 1,526 scenarios and measure their unconscious associations between race and status using implicit association tests (IATs), which predict discriminatory hiring decisions but have not been investigated in human-AI collaboration. When making decisions without AI or with AI that exhibits no race-based preferences, people select all candidates at equal rates. However, when interacting with AI favoring a particular group, people also favor those candidates up to 90% of the time, indicating a significant behavioral shift. The likelihood of selecting candidates whose identities do not align with common race-status stereotypes can increase by 13% if people complete an IAT before conducting resume screening. Finally, even if people think AI recommendations are of low quality or unimportant, their decisions are still vulnerable to AI bias under certain circumstances. This work has implications for people's autonomy in AI-HITL scenarios, AI and work, the design and evaluation of AI hiring systems, and strategies for mitigating bias in collaborative decision-making tasks. In particular, organizational and regulatory policy should acknowledge the complex nature of AI-HITL decision making when implementing these systems, educating people who use them, and determining which are subject to oversight.
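To make the reported comparisons concrete, below is a minimal, purely illustrative sketch of how selection rates by candidate group and alignment with a biased AI's preference might be computed from decision-level data. The column names, example rows, and thresholds are assumptions for illustration only; they are not the study's data, variables, or analysis code.

```python
import pandas as pd

# Purely illustrative rows (assumed layout, not the study's data): one row per
# screening decision, recording the AI condition, the group the simulated AI
# favored (if any), and the race of the candidate the participant selected.
df = pd.DataFrame({
    "condition":        ["no_ai", "neutral_ai", "biased_ai", "biased_ai"],
    "ai_favored_group": [None,    None,         "White",     "White"],
    "chosen_race":      ["Asian", "Black",      "White",     "White"],
})

# Selection rate for each candidate group within each AI condition.
# In the no-AI and neutral-AI conditions the paper reports roughly equal rates.
rates = pd.crosstab(df["condition"], df["chosen_race"], normalize="index")
print(rates)

# Share of choices matching the biased AI's preferred group
# (the paper reports alignment of up to ~90% in biased conditions).
biased = df[df["condition"] == "biased_ai"]
alignment = (biased["chosen_race"] == biased["ai_favored_group"]).mean()
print(f"Alignment with biased AI: {alignment:.0%}")
```

The same decision-level framing would let one compare selection rates for counter-stereotypical candidates between participants who did and did not complete an IAT first, which is where the reported 13% increase would appear.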