🤖 AI Summary
Large language models (LLMs) often mix factual and hallucinated statements within a single response, making it hard for humans to verify the output and base decisions on it. To address this, we propose Highlighted Chain-of-Thought Prompting (HoT), which prompts LLMs to embed XML-style tags in their responses that ground each claim in key facts highlighted in the input question. In few-shot settings, HoT outperforms vanilla chain-of-thought prompting (CoT) across 17 tasks spanning arithmetic, reading comprehension, and logical reasoning. A human study shows that the highlights help time-limited participants recognize correct LLM answers more accurately and efficiently; however, when the LLM is wrong, highlights tend to increase users' confidence in the incorrect answer. Together, these results point toward more verifiable LLM reasoning while revealing a trust-related pitfall in human-AI interaction.
📝 Abstract
An Achilles heel of Large Language Models (LLMs) is their tendency to hallucinate non-factual statements. A response that mixes factual and non-factual statements is difficult for humans to verify and to base decisions on. To combat this problem, we propose Highlighted Chain-of-Thought Prompting (HoT), a technique for prompting LLMs to generate responses with XML tags that ground facts to those provided in the query. That is, given an input question, the LLM first re-formats the question to add XML tags highlighting key facts, and then generates a response that highlights the facts referenced from the input. Interestingly, in few-shot settings, HoT outperforms vanilla chain-of-thought prompting (CoT) on a wide range of 17 tasks, from arithmetic and reading comprehension to logical reasoning. When asked to verify LLM responses, time-limited participants recognize correct LLM answers more accurately and efficiently when highlights are present. Yet, surprisingly, when LLMs are wrong, the highlights tend to make users believe that an answer is correct.
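To make the tagging scheme concrete, below is a minimal sketch of how the XML-style grounding could be checked programmatically. The `<factN>...</factN>` tag names, the example question, and the `grounded` helper are illustrative assumptions, not the paper's actual implementation: the idea is simply that every fact tag used in the response should refer back to a fact highlighted in the re-formatted question.

```python
import re

# Matches XML-style fact tags such as <fact1>...</fact1>; the \1 backreference
# ensures opening and closing tag numbers agree. Tag naming is an assumption.
TAG_RE = re.compile(r"<fact(\d+)>(.*?)</fact\1>", re.DOTALL)

def extract_facts(text):
    """Return {tag_id: highlighted span} for every <factN>...</factN> tag."""
    return {int(m.group(1)): m.group(2) for m in TAG_RE.finditer(text)}

def grounded(question, answer):
    """True iff every fact tag in the answer also appears in the question."""
    return set(extract_facts(answer)) <= set(extract_facts(question))

# Hypothetical HoT-style exchange: the question is re-formatted with
# highlighted key facts, and the response references those same tags.
question = ("<fact1>James writes a 3-page letter to 2 friends</fact1> "
            "<fact2>twice a week</fact2>. How many pages does he write a year?")
answer = ("Per sitting he writes <fact1>3 * 2 = 6</fact1> pages, "
          "<fact2>twice a week</fact2>, so 6 * 2 * 52 = 624 pages a year.")

print(grounded(question, answer))  # every answer tag is grounded in the input
```

A check like this could also serve as a lightweight filter: responses citing a fact tag that was never highlighted in the input are candidates for closer human review.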