Quantifying and Mitigating Self-Preference Bias of LLM Judges

📅 2026-04-24
🤖 AI Summary
This study addresses self-preference bias (SPB) in large language models (LLMs) used as evaluators: a systematic tendency to favor, or disfavor, their own generated responses, which undermines the reliability of automated evaluation. The work presents a fully automated framework that quantifies SPB by constructing response pairs of equal quality, with no human annotation required. Using statistical methods, it separates an LLM's discriminative capability from its bias propensity, and it introduces a structured, multi-dimensional evaluation strategy grounded in cognitive load decomposition to mitigate SPB. Experiments across 20 mainstream LLMs show that this strategy reduces SPB by 31.5% on average, and that stronger model capability is often uncorrelated, or even negatively correlated, with low SPB.

📝 Abstract
LLM-as-a-Judge has become a dominant approach in automated evaluation systems, playing critical roles in model alignment, leaderboard construction, and quality control. However, the scalability and trustworthiness of this approach can be substantially undermined by Self-Preference Bias (SPB), a directional evaluative deviation in which LLMs systematically favor or disfavor their own generated outputs during evaluation. Existing measurements rely on costly human annotations and conflate generative capability with evaluative stance, and are thus impractical for large-scale deployment in real-world systems. To address this issue, we introduce a fully automated framework for quantifying and mitigating SPB. It constructs pairs of responses with negligible quality differences, enabling statistical disentanglement of discriminability from bias propensity without human gold standards. Empirical analysis across 20 mainstream LLMs reveals that advanced capabilities are often uncorrelated, or even negatively correlated, with low SPB. To mitigate this bias, we propose a structured multi-dimensional evaluation strategy grounded in cognitive load decomposition, which reduces SPB by 31.5% on average.
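The abstract does not state the exact statistic the authors use, so the following is only a minimal sketch of the underlying idea, under stated assumptions. If the two responses in a pair are of equal quality, an unbiased judge should pick its own response about half the time, so a deviation from 0.5 can be read as a signed bias. The function names, the 0.5 baseline, the exact binomial test, and the relative-reduction formula are all illustrative assumptions, not the paper's definitions.

```python
from math import comb


def self_preference_score(self_wins: int, total: int) -> float:
    """Signed deviation of the self-pick rate from the 0.5 baseline.

    Assumption: every pair is equal-quality, so an unbiased judge picks
    its own response half the time. Positive means self-favoring,
    negative means self-disfavoring.
    """
    if total <= 0 or not 0 <= self_wins <= total:
        raise ValueError("need 0 <= self_wins <= total and total > 0")
    return self_wins / total - 0.5


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value under H0: P(self-pick) = 0.5."""
    pmf = [comb(n, i) / 2**n for i in range(n + 1)]
    observed = pmf[k]
    # Sum the probability of all outcomes at least as unlikely as k.
    tail = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


def relative_reduction(before: float, after: float) -> float:
    """Relative drop in bias magnitude (hypothetical metric)."""
    if before == 0:
        raise ValueError("baseline bias must be non-zero")
    return (abs(before) - abs(after)) / abs(before)


if __name__ == "__main__":
    # Made-up counts for illustration only, not results from the paper.
    score = self_preference_score(self_wins=70, total=100)
    p_value = binomial_two_sided_p(70, 100)
    print(f"score={score:+.2f}, p={p_value:.2g}")
```

A real measurement would also need to control for confounds such as answer position, for example by judging each pair in both orders; that step is omitted here for brevity.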
Problem

Research questions and friction points this paper is trying to address.

Self-Preference Bias
LLM-as-a-Judge
automated evaluation
evaluation bias
model alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Preference Bias
LLM-as-a-Judge
bias mitigation
automated evaluation
cognitive load decomposition
Jinming Yang
CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
Chuxian Qiu
CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
Zhenyu Deng
CompleX Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
Xinshan Jiao
University of Electronic Science and Technology of China
Complex Network
Tao Zhou
Web Sciences Center
networks, human dynamics, recommendation, prediction