🤖 AI Summary
This work addresses a critical issue: explanations generated by large language models (LLMs) often diverge from the models' underlying reasoning, undermining trustworthiness and transparency. To tackle this, we propose a collaborative consistency-diagnosis framework that jointly leverages rule-based symbolic information extraction (identifying entities and logical structures) and fine-tuned language models to automatically generate relevant, diverse, and targeted follow-up questions. Crucially, our approach is the first to organically combine symbolic systems and LLMs for follow-up question generation, markedly improving the detection of internal contradictions and critical omissions in explanations. Evaluated across multiple explanation-evaluation benchmarks, our method achieves a 27.4% absolute gain in accuracy at identifying inconsistent explanations over strong LLM-only baselines and existing detection techniques. This work introduces an efficient, scalable, and principled paradigm for consistency assessment in explainable AI.
📝 Abstract
Large Language Models (LLMs) are often asked to explain their outputs to enhance accuracy and transparency. However, evidence suggests that these explanations can misrepresent the models' true reasoning processes. One effective way to identify inaccuracies or omissions in these explanations is consistency checking, which typically involves asking follow-up questions. This paper introduces cross-examiner, a new method for generating follow-up questions based on a model's explanation of an initial question. Our method combines symbolic information extraction with language-model-driven question generation, producing better follow-up questions than those generated by LLMs alone. The approach is also more flexible than other methods and can generate a wider variety of follow-up questions.
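To make the two-stage idea concrete, here is a minimal, purely illustrative sketch in Python: a toy rule-based extractor pulls candidate entities from an explanation, and templated slots then yield targeted follow-up questions. The function names, the regex rule, and the templates are all hypothetical simplifications for illustration; in the actual method, the question-generation stage is driven by a language model rather than fixed templates.

```python
import re

# Hypothetical stopword list so the toy extractor skips sentence-initial articles.
STOPWORDS = {"The", "A", "An"}

def extract_entities(explanation: str) -> list[str]:
    """Toy symbolic extraction: treat capitalized word runs as candidate entities."""
    candidates = re.findall(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b", explanation)
    return sorted(set(c for c in candidates if c not in STOPWORDS))

def generate_followups(explanation: str) -> list[str]:
    """Fill templated slots with each extracted entity.

    A real system would hand these extracted slots to a fine-tuned
    language model instead of using fixed templates.
    """
    templates = [
        "What role does {e} play in your answer?",
        "Would your answer change if {e} were different?",
    ]
    questions = []
    for entity in extract_entities(explanation):
        for template in templates:
            questions.append(template.format(e=entity))
    return questions

if __name__ == "__main__":
    explanation = "The capital of France is Paris because Paris hosts the government."
    for question in generate_followups(explanation):
        print(question)
```

Checking the model's answers to these follow-ups against its original explanation is what surfaces internal contradictions or omissions; the symbolic stage keeps the questions anchored to entities the explanation actually mentions.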