FAIRE: Assessing Racial and Gender Bias in AI-Driven Resume Evaluations

📅 2025-04-02

🤖 AI Summary

This work addresses racial and gender bias in LLM-driven resume evaluation by proposing FAIRE, a fine-grained fairness benchmark tailored to hiring scenarios. Methodologically, it combines LLM-based semantic modeling, identity-aware prompt engineering, and contrastive bias measurement. FAIRE supports controlled perturbation of identity attributes, bias quantification across models, and two detection paths (direct scoring and ranking) to surface implicit bias in resume evaluation across multiple industries. Experiments show that every model tested exhibits some bias, with magnitude and direction varying considerably from model to model. The project open-sources the benchmark dataset and evaluation code, giving a reproducible basis for assessing the fairness of AI-powered recruitment tools.

📝 Abstract

In an era where AI-driven hiring is transforming recruitment practices, concerns about fairness and bias have become increasingly important. To explore these issues, we introduce a benchmark, FAIRE (Fairness Assessment In Resume Evaluation), to test for racial and gender bias in large language models (LLMs) used to evaluate resumes across different industries. We use two methods, direct scoring and ranking, to measure how model performance changes when resumes are slightly altered to reflect different racial or gender identities. Our findings reveal that while every model exhibits some degree of bias, the magnitude and direction vary considerably. This benchmark provides a clear way to examine these differences and offers valuable insights into the fairness of AI-based hiring tools. It highlights the urgent need for strategies to reduce bias in AI-driven recruitment. Our benchmark code and dataset are open-sourced at our repository: https://github.com/athenawen/FAIRE-Fairness-Assessment-In-Resume-Evaluation.git.
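
The direct-scoring idea in the abstract can be illustrated with a minimal sketch. This is not the code from the FAIRE repository: the function name `score_gaps`, the group labels, and the scores are all hypothetical, and the sketch assumes each identity variant of a resume has already been scored by a model on the same numeric scale.

```python
from statistics import mean


def score_gaps(scores, baseline):
    """Mean score difference of each identity variant relative to a baseline.

    scores: dict mapping a group label to a list of model scores, where
            index i in every list refers to the same underlying resume.
    baseline: the label of the unmodified (or reference) group.
    """
    base = scores[baseline]
    gaps = {}
    for group, values in scores.items():
        if group == baseline:
            continue
        if len(values) != len(base):
            raise ValueError(f"score list for {group!r} is not aligned with baseline")
        # Paired difference: same resume, only the identity cue changed.
        gaps[group] = mean(v - b for v, b in zip(values, base))
    return gaps


# Hypothetical scores for three resumes (illustrative numbers, not results).
example_scores = {
    "original": [7.0, 6.0, 8.0],
    "variant_a": [6.5, 6.0, 7.0],
    "variant_b": [7.5, 6.5, 8.0],
}
gaps = score_gaps(example_scores, "original")
```

A negative gap means the variant was scored lower on average than the baseline for otherwise identical resumes; a gap near zero suggests the identity cue had little effect on the score.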

Problem

Research questions and friction points this paper is trying to address.

Assessing racial and gender bias in AI resume evaluations

Measuring bias variations in large language models (LLMs)

Highlighting the need for strategies to reduce bias in AI-driven recruitment

Innovation

Methods, ideas, or system contributions that make the work stand out.

FAIRE benchmark assesses bias in AI hiring

Direct scoring and ranking methods used

Open-sourced dataset and code provided
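
The ranking method listed above can be sketched in the same spirit. The snippet below is an assumption-laden illustration, not the benchmark's implementation: `top_rank_rates`, the labels, and the rankings are hypothetical, and it assumes a model has been asked to order identity variants of one resume from best to worst.

```python
from collections import Counter


def top_rank_rates(rankings):
    """Fraction of trials in which each identity variant was ranked first.

    rankings: list of lists; each inner list orders group labels from
              best to worst for one resume.
    """
    firsts = Counter(r[0] for r in rankings if r)
    total = sum(firsts.values())
    if total == 0:
        raise ValueError("no non-empty rankings supplied")
    labels = {label for r in rankings for label in r}
    # Include labels that never ranked first so they appear with rate 0.0.
    return {label: firsts.get(label, 0) / total for label in sorted(labels)}


# Hypothetical rankings for four resumes (illustrative, not results).
example_rankings = [
    ["variant_a", "variant_b", "original"],
    ["variant_b", "variant_a", "original"],
    ["variant_a", "original", "variant_b"],
    ["variant_a", "variant_b", "original"],
]
rates = top_rank_rates(example_rankings)
```

Because the variants differ only in identity cues, an unbiased ranker would be expected to place each of k variants first about 1/k of the time; large departures from that rate point to a preference tied to the identity signal.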
