AI Summary
This study addresses a critical cognitive limitation in large language models (LLMs): their tendency to overlook crucial implicit context when processing explicit instructions, analogous to human inattentional blindness. To evaluate this phenomenon systematically, the authors introduce inattentional blindness into LLM assessment for the first time, proposing a novel explicit-implicit reasoning task and constructing MixRea, a benchmark comprising 2,246 multiple-choice questions. They further develop Potential Relation Completion Prompting (PRCP), a prompting method that recovers overlooked causal relations. Evaluation across 21 advanced models shows that even the best-performing model, Gemini 2.5 Pro, achieves only 42.8% consistency, underscoring how widespread this limitation is. According to the authors, PRCP improves models' ability to use implicit information, which helps mitigate inattentional blindness in LLMs.
Abstract
Large language models (LLMs) are increasingly integrated into high-stakes decision-making. Inspired by the theory of \emph{inattentional blindness} in human cognition, we investigate whether LLMs, trained on human-preferred corpora that embed attentional biases, exhibit a similar limitation: \emph{failing to attend to subtle yet important contextual cues under explicit task instructions}. To evaluate this, we introduce the task of \textbf{explicit-implicit reasoning} and present \textbf{MixRea}, a benchmark of 2,246 multiple-choice questions across 9 reasoning types with varying distributions of explicit and implicit information. Evaluation of 21 advanced LLMs shows that even the best-performing reasoning model (Gemini 2.5 Pro) achieves only 42.8\% consistency, revealing widespread inattentional blindness. To mitigate this, we propose \textbf{Potential Relation Completion Prompting (PRCP)}, a prompting method that improves reasoning by recovering overlooked causal relations. Further analysis shows that this limitation persists across diverse multi-source reasoning tasks, highlighting the need for more cognitively aligned models.