FairJudge: An Adaptive, Debiased, and Consistent LLM-as-a-Judge

📅 2026-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the susceptibility of existing large language models (LLMs) as evaluators to non-semantic cues—such as response position, length, and formatting—which leads to poor adaptability and inconsistency across evaluation settings. To mitigate these issues, the authors propose modeling the judging behavior as a learnable and regularized policy. They construct a high-information-density evaluation dataset and introduce a curriculum-based, multi-stage alignment framework that integrates supervised fine-tuning (SFT), direct preference optimization (DPO), and group relative policy optimization (GRPO). This co-optimized approach effectively suppresses non-semantic biases and enhances cross-modal consistency. Experimental results demonstrate that the proposed method significantly outperforms larger instruction-tuned models on multiple internal and external benchmarks, achieving substantial improvements in both judging consistency—measured by F1 score and agreement rate—and debiasing capability.

📝 Abstract
Existing LLM-as-a-Judge systems suffer from three fundamental limitations: limited adaptivity to task- and domain-specific evaluation criteria, systematic biases driven by non-semantic cues such as position, length, format, and model provenance, and evaluation inconsistency that leads to contradictory judgments across different evaluation modes (e.g., pointwise versus pairwise). To address these issues, we propose FairJudge, an adaptive, debiased, and consistent LLM-as-a-Judge. Unlike prior approaches that treat the judge as a static evaluator, FairJudge models judging behavior itself as a learnable and regularized policy. From a data-centric perspective, we construct a high-information-density judging dataset that explicitly injects supervision signals aligned with evaluation behavior. Building on this dataset, we adopt a curriculum-style SFT-DPO-GRPO training paradigm that progressively aligns rubric adherence, bias mitigation, and cross-mode consistency, while avoiding catastrophic forgetting. Experimental results on multiple internal and public benchmarks show that FairJudge consistently improves agreement and F1, reduces non-semantic biases, and outperforms substantially larger instruction-tuned LLMs. All resources will be publicly released after acceptance to facilitate future research.
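The position bias the abstract targets is often countered at inference time by querying a judge under both response orderings and only accepting order-invariant verdicts. The sketch below illustrates that generic protocol; the `judge_fn` interface and tie-on-disagreement rule are illustrative assumptions, not FairJudge's actual training-based method.

```python
# Position-debiased pairwise judging: query the judge with both
# response orderings and keep the verdict only when it is
# order-invariant; otherwise declare a tie. This is a generic
# debiasing protocol, NOT the paper's SFT-DPO-GRPO pipeline.

def debiased_pairwise_judge(judge_fn, prompt, resp_a, resp_b):
    """judge_fn(prompt, first, second) -> 'first' | 'second' | 'tie'."""
    v1 = judge_fn(prompt, resp_a, resp_b)   # A shown first
    v2 = judge_fn(prompt, resp_b, resp_a)   # B shown first
    # Map the order-dependent labels back to concrete responses.
    win1 = {"first": "A", "second": "B", "tie": "tie"}[v1]
    win2 = {"first": "B", "second": "A", "tie": "tie"}[v2]
    # Consistent verdicts survive; disagreement signals position bias.
    return win1 if win1 == win2 else "tie"

# A toy judge with pure position bias (always prefers whatever is
# shown first): the protocol collapses its verdicts to ties.
position_biased = lambda p, first, second: "first"
print(debiased_pairwise_judge(position_biased, "q", "a", "b"))  # tie

# A toy judge with a length preference: its verdict is
# order-invariant, so it passes through unchanged.
length_pref = lambda p, first, second: (
    "first" if len(first) > len(second) else "second"
)
print(debiased_pairwise_judge(length_pref, "q", "long answer", "a"))  # A
```

This kind of two-pass protocol doubles inference cost, which is one motivation for the paper's alternative of training the bias out of the judging policy itself.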
Problem

Research questions and friction points this paper is trying to address.

LLM-as-a-Judge
evaluation bias
adaptivity
consistency
non-semantic cues
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-as-a-Judge
bias mitigation
evaluation consistency
curriculum training
learnable judging policy
Bo Yang
College of Computer Science, Zhejiang University
Lanfei Feng
College of Computer Science, Zhejiang University
Yunkui Chen
College of Computer Science, Zhejiang University
Xiao Xu
College of Computer Science, Zhejiang University
Yu Zhang
Associate Professor, Zhejiang University
SLAM, 3D Vision, Robotics
Shijian Li
Zhejiang University
pervasive computing, human-computer interaction, artificial intelligence