🤖 AI Summary
This work identifies and quantifies a significant positional bias in large language models (LLMs) during financial binary decision-making, undermining reliability in high-stakes financial applications. To address this, we introduce the first unified framework and benchmark for detecting positional bias in financial LLMs—applied to the open-source Qwen2.5-Instruct series (1.5B–14B)—integrating mechanistic interpretability, attention pattern analysis, and hierarchical gradient tracing. We conduct systematic attribution using a novel, high-fidelity financial dataset curated for realism and task relevance. Our analysis reveals, for the first time, that bias magnitude scales with model size and can be reactivated (“bias resurgence”) by suboptimal prompt design; we further localize the critical network components and propagation pathways responsible for the bias. These findings establish a new standard for positional bias detection in financial LLMs and yield actionable mitigation strategies—substantially improving model interpretability and deployment robustness.
📝 Abstract
The growing adoption of large language models (LLMs) in finance exposes high-stakes decision-making to subtle, underexamined positional biases, a risk compounded by the complexity and opacity of modern model architectures. We present the first unified framework and benchmark that not only detects and quantifies positional bias in binary financial decisions but also pinpoints its mechanistic origins within open-source Qwen2.5-Instruct models (1.5B–14B). Our empirical analysis, conducted on a novel finance-authentic dataset, reveals that positional bias is pervasive, scale-sensitive, and prone to resurfacing under nuanced prompt designs and investment scenarios, with recency and primacy effects exposing new vulnerabilities in risk-laden contexts. Through transparent mechanistic interpretability, we map how and where bias emerges and propagates within the models, delivering actionable, generalizable insights across prompt types and scales. By bridging domain-specific auditing with model interpretability, our work provides a new methodological standard for both rigorous bias diagnosis and practical mitigation, establishing essential guidance for the responsible and trustworthy deployment of LLMs in financial systems.
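To make the core measurement concrete, the sketch below shows one common way to probe positional bias in binary decisions: pose the same question with the two options in both orders and count how often the chosen *content* flips with position. This is an illustrative stand-in, not the paper's framework — the actual study uses Qwen2.5-Instruct models plus attention and gradient analysis, and `model_choice` here is a hypothetical function returning the model's "A"/"B" answer.

```python
# Minimal positional-bias probe for binary decisions.
# `model_choice` is a hypothetical callable: prompt -> "A" or "B".

def build_prompt(question: str, first: str, second: str) -> str:
    """Binary prompt with options labeled A (first position) and B (second)."""
    return f"{question}\nA) {first}\nB) {second}\nAnswer with A or B."

def positional_flip_rate(model_choice, question: str, pairs) -> float:
    """Fraction of option pairs whose chosen content changes when the two
    options swap positions -- a direct symptom of positional bias."""
    flips = 0
    for opt1, opt2 in pairs:
        # Ask in both orders, then map the answered label back to content.
        ans_fwd = model_choice(build_prompt(question, opt1, opt2))
        ans_rev = model_choice(build_prompt(question, opt2, opt1))
        picked_fwd = opt1 if ans_fwd == "A" else opt2
        picked_rev = opt2 if ans_rev == "A" else opt1
        flips += picked_fwd != picked_rev
    return flips / len(pairs)

# Hypothetical model that always prefers the first position: maximal bias.
always_first = lambda prompt: "A"
pairs = [("buy", "sell"), ("hold", "short"), ("long bonds", "long equities")]
rate = positional_flip_rate(always_first, "Which action do you take?", pairs)
print(rate)  # 1.0 -- every decision flips with the option order
```

A content-consistent model would score 0.0 on this metric; scores near 1.0 indicate the answer tracks position rather than substance, which is the failure mode the benchmark quantifies.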