Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for Hallucination Detection

πŸ“… 2025-11-08
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Hallucination detection in large language models (LLMs) faces two key challenges: rigid verification strategies that fail to adapt to dynamic execution contexts, and prohibitively high computational costs when relying on proprietary LLMs (e.g., GPT-4) for validation. To address these, we propose LEAPβ€”a novel framework that formulates hallucination detection as a dynamic policy learning problem, enabling lightweight open-source student models to autonomously adapt their verification strategies. LEAP integrates a teacher-student architecture, dynamic learning loops, trajectory generation, policy distillation, and active correction mechanisms, achieving robust capability transfer via proxy-based fine-tuning. Evaluated on three challenging benchmarks, LEAP consistently outperforms existing state-of-the-art methods, delivering substantial gains in both detection accuracy and strategy adaptability while maintaining low inference overhead.

πŸ“ Abstract
Hallucination in large language models (LLMs) remains a critical barrier to their safe deployment. Existing tool-augmented hallucination detection methods require pre-defined, fixed verification strategies, which are crucial to the quality and effectiveness of tool calls. Some methods directly employ powerful closed-source LLMs such as GPT-4 as detectors, which are effective but too costly. To mitigate the cost issue, other methods adopt a teacher-student architecture and fine-tune open-source small models as detectors via agent tuning. However, these methods are limited by their fixed strategies: when faced with a dynamically changing execution environment, they may lack adaptability, call tools inappropriately, and ultimately fail at detection. To address this insufficient strategy adaptability, we propose the "Learning to Evaluate and Adaptively Plan" (LEAP) framework, which endows an efficient student model with the dynamic learning and proactive correction capabilities of the teacher model. Specifically, our method formulates hallucination detection as a dynamic strategy learning problem. We first employ a teacher model to generate trajectories within a dynamic learning loop, adjusting the strategy in response to execution failures. We then distill this dynamic planning capability into an efficient student model via agent tuning. Finally, during strategy execution, the student model adopts a proactive correction mechanism, enabling it to propose, review, and optimize its own verification strategies before execution. Experiments on three challenging benchmarks demonstrate that our LEAP-tuned model outperforms existing state-of-the-art methods.
Problem

Research questions and friction points this paper is trying to address.

Detecting hallucinations in large language models using adaptable verification strategies
Reducing costs by distilling dynamic planning into efficient small models
Enhancing strategy adaptability through proactive correction mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic learning loop for adaptive strategy adjustment
Knowledge distillation from teacher to student model
Proactive correction mechanism for strategy optimization
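The proactive correction mechanism described above (the student model proposes a verification strategy, reviews it, and revises it before execution) can be sketched roughly as follows. This is a minimal illustrative loop, not the paper's implementation: all function names and the rule-based "review" logic are hypothetical stand-ins for what would be model calls in LEAP.

```python
# Hypothetical sketch of a propose -> review -> correct loop.
# In LEAP, each of these steps would be performed by the tuned
# student model; here they are simple rule-based stand-ins.

def propose_strategy(claim):
    """Stand-in for the student model drafting tool-call steps."""
    # Deliberately flawed draft: compares against evidence it never gathered.
    return ["compare(evidence, claim)"]

def review_strategy(strategy):
    """Stand-in reviewer: flag strategies that skip evidence gathering."""
    issues = []
    if not any(step.startswith("search") for step in strategy):
        issues.append("missing evidence-gathering step")
    return issues

def correct_strategy(strategy, issues):
    """Stand-in corrector: patch the flagged defect before execution."""
    if "missing evidence-gathering step" in issues:
        return ["search(claim)"] + strategy
    return strategy

def proactive_correction(claim, max_rounds=3):
    """Refine the verification strategy until review passes (or give up)."""
    strategy = propose_strategy(claim)
    for _ in range(max_rounds):
        issues = review_strategy(strategy)
        if not issues:
            break
        strategy = correct_strategy(strategy, issues)
    return strategy

print(proactive_correction("The Eiffel Tower is in Berlin."))
```

The point of the sketch is the control flow: the strategy is criticized and repaired *before* any tool is actually called, rather than after a detection failure.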
πŸ”Ž Similar Papers
No similar papers found.
Zepeng Bao
Wuhan University
LLM Hallucination
Shen Zhou
Wuhan University
Qiankun Pi
School of Computer Science, Wuhan University, China
Jianhao Chen
School of Computer Science, Wuhan University, China; Zhongguancun Academy, Beijing, China
Mayi Xu
Wuhan University
Natural Language Processing
Ming Zhong
School of Computer Science, Wuhan University, China
Yuanyuan Zhu
School of Computer Science, Wuhan University, China
Tieyun Qian
Wuhan University
natural language processing; web data mining