🤖 AI Summary
This work identifies significant racial bias in large language models (LLMs) when they simulate mortgage approval decisions, a failure mode that risks entrenching systemic financial inequity. To address it, we propose the first reproducible counterfactual testing framework for LLMs, integrating inter-layer sensitive attribute tracing with control vector intervention (CVI) to decouple bias mitigation from performance optimization. Using counterfactual reasoning and layer-wise representation analysis, we quantitatively demonstrate that LLM-based credit decisions exhibit racially skewed outcomes exceeding historical benchmarks. CVI reduces racial disparity by an average of 33% (up to 70%) while preserving model accuracy. Our framework establishes a novel paradigm for fairness assessment and controllable debiasing of LLMs in high-stakes financial applications, offering both methodological rigor and a scalable technical pathway for equitable AI deployment.
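To make the counterfactual testing setup concrete, here is a minimal sketch under simplifying assumptions: `query_llm` is a hypothetical stand-in for whatever chat/completion API is used, and the applicant fields and single-word decision format are illustrative rather than the paper's actual prompts.

```python
# Sketch of counterfactual pair testing: applicant profiles identical except race.
# `query_llm` is a hypothetical callable (prompt -> model response string).
from itertools import product

RACES = ["White", "Black", "Asian", "Hispanic"]

def build_prompt(profile: dict, race: str) -> str:
    # Same profile fields every time; only the race attribute is swapped.
    return (
        "You are a mortgage underwriter. Decide APPROVE or DENY.\n"
        f"Applicant: race={race}, income=${profile['income']:,}, "
        f"credit_score={profile['credit_score']}, "
        f"loan_amount=${profile['loan_amount']:,}, dti={profile['dti']:.2f}\n"
        "Answer with a single word: APPROVE or DENY."
    )

def approval_rates(profiles, query_llm):
    """Query the model on counterfactual variants and tally approvals per race."""
    counts = {r: {"approve": 0, "total": 0} for r in RACES}
    for profile, race in product(profiles, RACES):
        decision = query_llm(build_prompt(profile, race)).strip().upper()
        counts[race]["total"] += 1
        counts[race]["approve"] += int(decision.startswith("APPROVE"))
    return {r: c["approve"] / c["total"] for r, c in counts.items()}

def disparity(rates: dict) -> float:
    """Gap between highest and lowest approval rate for otherwise identical applicants."""
    return max(rates.values()) - min(rates.values())
```

In this sketch the disparity metric is simply the max-min gap in approval rates across races; the paper's exact disparity measure and applicant attributes may differ.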
📝 Abstract
Financial institutions increasingly rely on large language models (LLMs) for high-stakes decision-making. However, these models risk perpetuating harmful biases if deployed without careful oversight. This paper investigates racial bias in LLMs through the lens of credit decision-making tasks, on the premise that biases identified here signal broader concerns across financial applications. We introduce a reproducible counterfactual testing framework that evaluates how models respond to simulated mortgage applicants who are identical in all attributes except race. Our results reveal significant race-based discrepancies that exceed historically observed levels of bias. Using layer-wise analysis, we trace how sensitive attributes propagate through internal model representations. Building on this, we deploy a control vector intervention that reduces racial disparities by up to 70% (33% on average) without impairing overall model performance. Our approach provides a transparent and practical toolkit for identifying and mitigating bias in financial LLM deployments.
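As a rough illustration of what a control vector intervention can look like in practice (not necessarily the paper's exact CVI procedure), the sketch below derives a steering vector from the difference in hidden states between counterfactual prompt pairs at one layer, then subtracts a scaled copy of it from that layer's output during inference via a PyTorch forward hook. The model name, layer index, scaling coefficient, and example pair are placeholder assumptions, and the layer access pattern assumes a Llama-style decoder.

```python
# Illustrative control-vector intervention (activation steering); not the paper's exact CVI.
# Model name, LAYER, ALPHA, and the prompt pair are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumption: any decoder-only HF model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LAYER = 15   # assumption: an intermediate layer where the race attribute is encoded
ALPHA = 4.0  # assumption: intervention strength

def hidden_at_layer(prompt: str, layer: int) -> torch.Tensor:
    """Mean hidden state of the prompt tokens at the given layer."""
    with torch.no_grad():
        out = model(**tok(prompt, return_tensors="pt"), output_hidden_states=True)
    return out.hidden_states[layer][0].mean(dim=0)

# Control vector: average hidden-state difference between counterfactual pairs
# that differ only in the race attribute.
pairs = [
    ("Applicant race: Black. Income: $80,000. Decision:",
     "Applicant race: White. Income: $80,000. Decision:"),
]
control_vec = torch.stack(
    [hidden_at_layer(a, LAYER) - hidden_at_layer(b, LAYER) for a, b in pairs]
).mean(dim=0)

def steer_hook(module, inputs, output):
    """Subtract the scaled control vector from the layer's output hidden states."""
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden - ALPHA * control_vec.to(hidden.dtype)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.model.layers[LAYER].register_forward_hook(steer_hook)
# ... run the mortgage-decision prompts with the hook active, then remove it:
handle.remove()
```

Rerunning the counterfactual evaluation with the hook active, and comparing disparity before and after, is one way to reproduce the kind of before/after comparison the results describe; the layer choice and intervention strength would need to be tuned per model.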