🤖 AI Summary
This paper presents a systematic fairness evaluation of LLM-based job-resume matching in the U.S. HR context, focusing on biases arising from gender, race, and educational background. Methodologically, we empirically assess leading models (GPT-4, Claude, Llama) using controlled prompt engineering, counterfactual perturbations, and multidimensional fairness metrics, including demographic parity and equalized odds. Our key finding is that explicit biases tied to gender and race have substantially diminished (bias < 8.5%), whereas implicit bias concerning educational background remains pronounced (mean 37.2%, *p* < 0.001). Our primary contributions are: (1) the first multidimensional fairness evaluation framework tailored to HR applications of LLMs; (2) a reproducible benchmark for bias diagnosis; and (3) actionable mitigation pathways. Together, these offer both theoretical grounding and practical guidance for deploying fairer LLM-powered recruitment tools in industry.
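The two fairness metrics named above have standard definitions: demographic parity compares shortlisting rates across groups, while equalized odds compares true-positive and false-positive rates. The sketch below shows one minimal way such gaps might be computed over binary shortlisting decisions; the function names and the max-gap aggregation are illustrative assumptions, not the paper's exact metric implementations.

```python
# Illustrative sketch: demographic parity and equalized odds gaps over
# binary shortlisting decisions. Names here are assumptions, not the
# paper's released code.
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Per-group positive rate, TPR, and FPR for binary decisions."""
    stats = defaultdict(lambda: {"pos": 0, "n": 0, "tp": 0, "p": 0,
                                 "fp": 0, "neg": 0})
    for yt, yp, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["pos"] += yp
        if yt == 1:
            s["p"] += 1
            s["tp"] += yp
        else:
            s["neg"] += 1
            s["fp"] += yp
    return {
        g: {
            "positive_rate": s["pos"] / s["n"],
            "tpr": s["tp"] / s["p"] if s["p"] else float("nan"),
            "fpr": s["fp"] / s["neg"] if s["neg"] else float("nan"),
        }
        for g, s in stats.items()
    }

def demographic_parity_gap(rates):
    """Max difference in shortlisting rate between any two groups."""
    r = [v["positive_rate"] for v in rates.values()]
    return max(r) - min(r)

def equalized_odds_gap(rates):
    """Max of the between-group TPR gap and FPR gap."""
    tpr = [v["tpr"] for v in rates.values()]
    fpr = [v["fpr"] for v in rates.values()]
    return max(max(tpr) - min(tpr), max(fpr) - min(fpr))

# y_true: ground-truth match labels; y_pred: LLM shortlisting decisions.
rates = group_rates(y_true=[1, 1, 0, 0, 1, 0],
                    y_pred=[1, 0, 0, 1, 1, 0],
                    groups=["a", "b", "a", "b", "a", "b"])
print(demographic_parity_gap(rates), equalized_odds_gap(rates))
```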
📝 Abstract
Large Language Models (LLMs) offer the potential to automate hiring by matching job descriptions with candidate resumes, streamlining recruitment processes and reducing operational costs. However, biases inherent in these models may lead to unfair hiring practices, reinforcing societal prejudices and undermining workplace diversity. This study examines the performance and fairness of LLMs on job-resume matching tasks in an English-language, U.S. context. It evaluates how factors such as gender, race, and educational background influence model decisions, providing critical insights into the fairness and reliability of LLMs in HR applications. Our findings indicate that while recent models have reduced biases related to explicit attributes such as gender and race, implicit biases concerning educational background remain significant. These results highlight the need for ongoing evaluation and the development of advanced bias mitigation strategies to ensure equitable hiring practices when LLMs are used in industry settings.
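To make the evaluation setup concrete, the sketch below illustrates the counterfactual-perturbation idea: hold a resume fixed, vary only one attribute signal, and check how often the model's decision flips. Everything here is a hedged assumption for illustration; `query_llm_shortlist`, the template, and the flip-rate definition are hypothetical stand-ins, not the paper's protocol.

```python
# Illustrative counterfactual perturbation: swap attribute signals in an
# otherwise identical resume and compare shortlisting decisions.
from itertools import product

RESUME_TEMPLATE = """Name: {name}
Education: {education}
Experience: 5 years as a backend software engineer; led a team of 4.
Skills: Python, SQL, distributed systems."""

# Attribute variants; names act as proxies for gender/race signals.
NAMES = ["Emily Walsh", "Jamal Washington", "Wei Chen"]
EDUCATIONS = ["Stanford University, B.S. Computer Science",
              "Regional state college, B.S. Computer Science"]

def query_llm_shortlist(job_description: str, resume: str) -> bool:
    """Placeholder for a real LLM API call returning a yes/no decision.
    Simulates an education-biased scorer purely for demonstration."""
    return "Stanford" in resume

def counterfactual_flip_rate(job_description: str) -> float:
    """Fraction of variant pairs on which the model's decision flips."""
    decisions = {}
    for name, edu in product(NAMES, EDUCATIONS):
        resume = RESUME_TEMPLATE.format(name=name, education=edu)
        decisions[(name, edu)] = query_llm_shortlist(job_description, resume)
    flips = sum(1 for a, b in product(decisions, decisions)
                if a < b and decisions[a] != decisions[b])
    pairs = len(decisions) * (len(decisions) - 1) // 2
    return flips / pairs

print(counterfactual_flip_rate("Backend engineer, 5+ years of Python"))
```

With the biased placeholder above, every decision flip is driven by the education field, mirroring the kind of implicit educational-background bias the study reports.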