🤖 AI Summary
Large language models (LLMs) exhibit rigid inference behavior: unlike humans, they do not dynamically switch between System 1 (intuitive) and System 2 (analytical) cognitive styles, which limits their generalization.
Method: We propose the first explicit framework that decouples and models dual-system reasoning in LLMs, constructing an interpolatable, interpretable continuous reasoning spectrum. Using a 2,000-sample dataset annotated with valid answers in both styles, we combine supervised fine-tuning, response-mechanism analysis, uncertainty quantification, and style interpolation.
Contribution/Results: Our analysis uncovers a fundamental accuracy–efficiency trade-off. Empirically, System 2-aligned models improve arithmetic and symbolic reasoning accuracy by 12.3%, while System 1-aligned models boost commonsense reasoning accuracy by 8.7%. Crucially, interpolating between the two styles yields monotonic performance variation across the spectrum while preserving reasoning coherence.
📝 Abstract
Large Language Models (LLMs) exhibit impressive reasoning abilities, yet their reliance on structured step-by-step processing reveals a critical limitation. While human cognition fluidly adapts between intuitive, heuristic (System 1) and analytical, deliberative (System 2) reasoning depending on context, LLMs lack this dynamic flexibility. This rigidity can lead to brittle, unreliable performance on tasks that deviate from trained patterns. To address this, we create a dataset of 2,000 samples with valid System 1 and System 2 answers, explicitly align LLMs with each reasoning style, and evaluate their performance across reasoning benchmarks. Our results reveal an accuracy–efficiency trade-off: System 2-aligned models excel at arithmetic and symbolic reasoning, while System 1-aligned models perform better on commonsense tasks. A mechanistic analysis of model responses shows that System 1 models produce more definitive answers, whereas System 2 models express greater uncertainty. Interpolating between these extremes produces a monotonic transition in reasoning accuracy while preserving coherence. This work challenges the assumption that step-by-step reasoning is always optimal and highlights the need to adapt reasoning strategies to task demands.
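The abstract does not specify how the interpolation between the two aligned models is performed; one common realization of such a "continuous spectrum" is linear interpolation of the two fine-tuned checkpoints' parameters. The sketch below illustrates that idea with plain Python dicts standing in for model state; the function name and the toy parameter names are hypothetical, not from the paper.

```python
def interpolate_weights(system1_weights: dict, system2_weights: dict, alpha: float) -> dict:
    """Linearly blend two aligned models' parameters.

    alpha = 0.0 -> pure System 1 model; alpha = 1.0 -> pure System 2 model.
    Sweeping alpha in [0, 1] traces one point per step along the reasoning spectrum.
    """
    assert system1_weights.keys() == system2_weights.keys(), "checkpoints must share architecture"
    return {
        name: (1.0 - alpha) * w1 + alpha * system2_weights[name]
        for name, w1 in system1_weights.items()
    }

# Toy example with scalar "parameters"; real checkpoints would hold tensors,
# and the same formula would be applied elementwise to each tensor.
s1 = {"layer.0.w": 0.2, "layer.1.w": -1.0}   # System 1-aligned checkpoint (hypothetical)
s2 = {"layer.0.w": 1.0, "layer.1.w": 3.0}    # System 2-aligned checkpoint (hypothetical)
mid = interpolate_weights(s1, s2, 0.5)        # midpoint of the spectrum
```

Under this scheme, the monotonic accuracy transition reported in the abstract corresponds to evaluating the blended model at a grid of alpha values between 0 and 1.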