Thinking in a Crowd: How Auxiliary Information Shapes LLM Reasoning

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the causal impact of external auxiliary information (categorized as beneficial, irrelevant, or misleading) on the stepwise reasoning process of large language models (LLMs). It finds that misleading information amplifies errors by distorting the model's reasoning patterns, challenging the implicit assumption that longer reasoning inherently improves performance. The authors introduce SciAux, a new benchmark built on ScienceQA that enables controlled causal analysis of how information quality intervenes in reasoning paths. Through systematic ablation and counterfactual experiments, they show that beneficial information consistently improves accuracy, whereas misleading information induces catastrophic performance degradation; critically, current LLMs exhibit no robust capacity to evaluate the quality of auxiliary information. The core contribution is the first formal establishment of a causal chain linking auxiliary information to reasoning path to output error. The authors further advocate for "critical reasoning architectures" explicitly designed to assess, filter, and adaptively integrate external information, a foundational step toward epistemically aware LLMs.

📝 Abstract
The capacity of Large Language Models (LLMs) to reason is fundamental to their application in complex, knowledge-intensive domains. In real-world scenarios, LLMs are often augmented with external information that can be helpful, irrelevant, or even misleading. This paper investigates the causal impact of such auxiliary information on the reasoning process of LLMs with explicit step-by-step thinking capabilities. We introduce SciAux, a new dataset derived from ScienceQA, to systematically test the robustness of the model against these types of information. Our findings reveal a critical vulnerability: the model's deliberative "thinking mode" is a double-edged sword. While helpful context improves accuracy, misleading information causes a catastrophic drop in performance, which is amplified by the thinking process. Instead of conferring robustness, thinking reinforces the degree of error when provided with misinformation. This highlights that the challenge is not merely to make models "think", but to endow them with the critical faculty to evaluate the information upon which their reasoning is based. The SciAux dataset is available at https://huggingface.co/datasets/billhdzhao/SciAux.
Problem

Research questions and friction points this paper is trying to address.

Investigating how external information affects LLM reasoning processes
Testing model robustness against helpful, irrelevant, or misleading data
Addressing how thinking modes amplify errors from misinformation
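
The three auxiliary-information conditions above amount to a controlled prompt-construction step. A minimal sketch follows; the function, field names, and prompt wording are illustrative assumptions, not the paper's actual SciAux format:

```python
# Hypothetical sketch of the beneficial / irrelevant / misleading
# conditions; the prompt template is an assumption, not SciAux's.

def build_prompt(question, choices, aux, condition):
    """Compose a multiple-choice QA prompt, optionally prepending
    auxiliary context.

    condition: "none", "beneficial", "irrelevant", or "misleading".
    The label only documents experimental intent; the model sees the
    auxiliary text without any quality marker.
    """
    lines = []
    if condition != "none" and aux:
        lines.append(f"Context: {aux}")
    lines.append(f"Question: {question}")
    for i, choice in enumerate(choices):
        lines.append(f"({chr(ord('A') + i)}) {choice}")
    lines.append("Answer with the letter of the correct choice.")
    return "\n".join(lines)

# Example: a misleading-context variant of a ScienceQA-style item.
prompt = build_prompt(
    question="Which of these is a conductor of electricity?",
    choices=["rubber", "copper"],
    aux="Rubber conducts electricity well.",  # deliberately false
    condition="misleading",
)
```

Because only the `Context:` line varies across conditions while question and choices stay fixed, any accuracy difference can be attributed to the auxiliary information itself.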
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces SciAux, a dataset derived from ScienceQA
Tests model robustness against auxiliary information
Reveals thinking mode amplifies misleading information errors
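
The "thinking amplifies error" finding reduces to comparing per-condition accuracy with and without step-by-step reasoning. A minimal scoring sketch, where all predictions are toy data and not results from the paper:

```python
# Hypothetical scoring sketch: per-condition accuracy for a "thinking"
# vs. a "direct" answering mode. Toy predictions, not real results.

def accuracy(preds, gold):
    """Fraction of predictions matching the gold answers."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

gold = ["B", "A", "C", "B"]
runs = {
    # condition -> mode -> predicted answer letters (toy values)
    "beneficial": {"direct": ["B", "A", "C", "B"],
                   "thinking": ["B", "A", "C", "B"]},
    "misleading": {"direct": ["B", "A", "A", "B"],
                   "thinking": ["A", "C", "A", "A"]},
}

for condition, modes in runs.items():
    acc_direct = accuracy(modes["direct"], gold)
    acc_think = accuracy(modes["thinking"], gold)
    # If thinking amplifies misinformation, the accuracy drop under the
    # misleading condition is larger in thinking mode than direct mode.
    print(f"{condition}: direct={acc_direct:.2f} thinking={acc_think:.2f}")
```

In this toy run the misleading condition hurts both modes, but the thinking mode degrades further, which is the shape of the amplification effect the paper reports.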