🤖 AI Summary
In Chinese question-answering systems, user input errors often lead large language models to misinterpret intent or over-correct the question's structure, degrading answer accuracy. To address this, we propose a knowledge-enhanced, reinforcement-learning-guided error correction framework. First, QuestionRAG integrates external knowledge (search results and relevant entities) to mitigate intent misclassification. Second, an RL-based alignment module refines erroneous questions while preserving their original semantic structure. Crucially, this approach avoids heavy reliance on labeled data for supervised fine-tuning. Experiments show that knowledge enhancement significantly improves intent recognition, and that the RL policy achieves higher correction accuracy and structural fidelity than conventional fine-tuning baselines. Overall, the framework substantially improves model robustness and generalization on erroneous Chinese questions, unlocking the potential of large language models for Chinese QA error correction.
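The knowledge-enhancement step described above can be pictured as prompt construction that grounds the correction in retrieved context. The sketch below is purely illustrative — the function name, prompt wording, and input fields are our assumptions, not the paper's actual implementation:

```python
def build_correction_prompt(question, search_results, entities):
    """Assemble an LLM prompt that grounds error correction in external
    knowledge (retrieved search results and related entities), as a
    hypothetical stand-in for QuestionRAG's knowledge-augmented input.
    """
    # Format retrieved evidence as a bulleted context block.
    context = "\n".join(f"- {r}" for r in search_results)
    entity_list = ", ".join(entities)
    return (
        "Correct only the errors in the user's question. "
        "Preserve its original wording and structure wherever possible.\n"
        f"Related entities: {entity_list}\n"
        f"Search results:\n{context}\n"
        f"Question: {question}\n"
        "Corrected question:"
    )
```

The resulting string would be passed to any instruction-following LLM; the extra context is what lets the model disambiguate a garbled entity name instead of guessing at intent.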
📝 Abstract
Input errors in question-answering (QA) systems often lead to incorrect responses. Large language models (LLMs) struggle with this task, frequently failing to interpret user intent (misinterpretation) or unnecessarily altering the original question's structure (over-correction). We propose QuestionRAG, a framework that tackles these problems. To address misinterpretation, it enriches the input with external knowledge (e.g., search results, related entities). To prevent over-correction, it uses reinforcement learning (RL) to align the model's objective with precise correction, not just paraphrasing. Our results demonstrate that knowledge augmentation is critical for understanding faulty questions. Furthermore, RL-based alignment proves significantly more effective than traditional supervised fine-tuning (SFT), boosting the model's ability to follow instructions and generalize. By integrating these two strategies, QuestionRAG unlocks the full potential of LLMs for the question correction task.
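The RL alignment idea — rewarding precise correction rather than paraphrasing — can be sketched as a toy reward function. This is our own minimal illustration under assumed design choices (exact-match accuracy plus a surface-similarity bonus), not the paper's reward:

```python
import difflib

def correction_reward(original, corrected, reference):
    """Toy reward for question correction: full credit for matching the
    reference correction, plus a fidelity bonus for staying close to the
    original question's surface form. The fidelity term penalizes
    over-correction (free paraphrases score lower than minimal edits).
    Weights and similarity metric are illustrative assumptions.
    """
    accuracy = 1.0 if corrected == reference else 0.0
    # Character-level similarity to the original question in [0, 1].
    fidelity = difflib.SequenceMatcher(None, original, corrected).ratio()
    return accuracy + 0.5 * fidelity
```

Under a reward like this, a policy that fixes only the misspellings outscores one that rewrites the question into a fluent but structurally different paraphrase, which is exactly the over-correction behavior the framework aims to suppress.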