When and What to Ask: AskBench and Rubric-Guided RLVR for LLM Clarification

📅 2026-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the susceptibility of large language models to hallucination when prompts omit critical details or contain misleading information, a failure stemming from their limited ability to proactively seek clarification. To tackle this, the authors introduce AskBench—the first interactive evaluation benchmark specifically designed for scenarios involving ambiguous user intent and model overconfidence—and propose rubric-guided RLVR, reinforcement learning with verifier-based rewards steered by structured rubrics. Within a unified judge loop, the verifier rewards the model for asking precise clarification questions at appropriate moments, jointly optimizing clarification behavior and task performance. Experiments show that the method significantly improves accuracy, rubric adherence, and interaction efficiency, and generalizes well to unseen domains.
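The unified judge loop described in the summary can be sketched as below. All class, function, and field names here are illustrative assumptions, not the paper's actual implementation: the judge either scores a final answer or, when the model asks for clarification, simulates the user's reply from hidden ground-truth intent.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    role: str   # "model" or "user"
    text: str

@dataclass
class Episode:
    question: str        # possibly intent-deficient or false-premise query
    hidden_intent: str   # ground-truth detail the judge can reveal
    turns: list = field(default_factory=list)

def judge_loop(model, judge, episode, max_turns=4):
    """Unified judge loop (illustrative): at each turn the judge decides
    whether the model's reply is a final answer (score it) or a
    clarification question (simulate the user's response)."""
    for _ in range(max_turns):
        reply = model(episode)                      # model answers or asks
        episode.turns.append(Turn("model", reply))
        if judge.classify(reply) == "answer":
            return judge.score(reply, episode)      # final-answer evaluation
        user_reply = judge.simulate_user(reply, episode.hidden_intent)
        episode.turns.append(Turn("user", user_reply))
    return 0.0  # ran out of turns without committing to a final answer
```

In a real harness, `model` and `judge` would both be LLM calls; the point of the sketch is the control flow that lets one judge both evaluate answers and play the user.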

📝 Abstract
Large language models (LLMs) often respond even when prompts omit critical details or include misleading information, leading to hallucinations or reinforced misconceptions. We study how to evaluate and improve LLMs' ability to decide when and what to ask for clarification without sacrificing task performance. We introduce AskBench, an interactive benchmark that converts standard QA pairs into multi-turn interactions with explicit checkpoints. A unified judge loop evaluates final answers and simulates user responses as needed. AskBench covers two settings: AskMind, with intent-deficient queries requiring clarification, and AskOverconfidence, with queries containing false premises that must be identified and corrected. We further propose rubric-guided reinforcement learning with verifier-based rewards (RLVR), which uses structured rubrics to encourage targeted clarification. Experiments show consistent improvements in accuracy, rubric adherence, and interaction efficiency, with strong generalization to unseen domains.
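The verifier-based reward the abstract describes can be sketched as a weighted rubric checklist. The rubric items, weights, and substring matching below are assumptions for illustration only; an actual verifier would likely use an LLM judge rather than string checks.

```python
def rubric_reward(transcript: str, rubric: list[tuple[str, float]]) -> float:
    """Illustrative rubric-guided reward: each rubric item is a
    (required phrase, weight) pair, and the reward is the weighted
    fraction of items satisfied by the interaction transcript."""
    total = sum(w for _, w in rubric)
    earned = sum(w for phrase, w in rubric
                 if phrase.lower() in transcript.lower())
    return earned / total if total else 0.0

# Hypothetical rubric for a false-premise (AskOverconfidence-style) query:
EXAMPLE_RUBRIC = [
    ("premise is incorrect", 0.5),  # model must flag the false premise
    ("correct answer", 0.3),        # ...then supply the corrected answer
    ("clarify", 0.2),               # ...asking only targeted questions
]
```

A reward of this shape can then be fed to a standard RL trainer; structuring it as per-item checks is what makes clarification behavior and task success jointly optimizable.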
Problem

Research questions and friction points this paper is trying to address.

large language models
clarification
hallucination
misconceptions
interactive evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

AskBench
RLVR
clarification
rubric-guided reinforcement learning
interactive evaluation
Jiale Zhao
Chongqing University of Posts and Telecommunications
Ke Fang
University of Pennsylvania
Lu Cheng
Assistant Professor, UIC CS
Socially Responsible AI, Causal Machine Learning, Data Mining, AI for Good