MixRea: Benchmarking Explicit-Implicit Reasoning in Large Language Models

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This study examines a cognitive limitation in large language models (LLMs): a tendency to overlook important implicit context when following explicit instructions, analogous to inattentional blindness in humans. To evaluate this systematically, the authors bring inattentional blindness into LLM evaluation for the first time, define an explicit-implicit reasoning task, and construct MixRea, a benchmark of 2,246 multiple-choice questions across 9 reasoning types. They also propose Potential Relation Completion Prompting (PRCP), a prompting method that recovers overlooked causal relations. Across 21 advanced LLMs, even the best-performing model, Gemini 2.5 Pro, reaches only 42.8% consistency, which indicates that the limitation is widespread. PRCP improves the models’ use of implicit information, helping to mitigate inattentional blindness in LLMs.

📝 Abstract

Large language models (LLMs) are increasingly integrated into high-stakes decision-making. Inspired by the theory of inattentional blindness in human cognition, we investigate whether LLMs, trained on human-preferred corpora that embed attentional biases, exhibit a similar limitation: failing to attend to subtle yet important contextual cues under explicit task instructions. To evaluate this, we introduce the task of explicit-implicit reasoning and present MixRea, a benchmark of 2,246 multiple-choice questions across 9 reasoning types with varying distributions of explicit and implicit information. Evaluation of 21 advanced LLMs shows that even the best-performing reasoning model (Gemini 2.5 Pro) achieves only 42.8% consistency, revealing widespread inattentional blindness. To mitigate this, we propose Potential Relation Completion Prompting (PRCP), a prompting method that improves reasoning by recovering overlooked causal relations. Further analysis shows that this limitation persists across diverse multi-source reasoning tasks, highlighting the need for more cognitively aligned models.

Problem

Research questions and friction points this paper is trying to address.

inattentional blindness

explicit-implicit reasoning

large language models

contextual cues

reasoning consistency

Innovation

Methods, ideas, or system contributions that make the work stand out.

explicit-implicit reasoning

inattentional blindness

MixRea benchmark