🤖 AI Summary
This work systematically investigates the reasoning mechanisms of large reasoning models (e.g., OpenAI o1, DeepSeek R1) to clarify where their capabilities originate, where they remain effective, and which common beliefs about them are misconceptions. We propose the first unified analytical framework integrating behavioral evaluation, architectural inversion, chain-of-thought tracing, and computational trajectory visualization. Empirical analysis reveals that these models' reasoning proficiency stems primarily from search-augmented inference and strategic computation delay rather than implicit logical deduction. We characterize their genuine strengths and systematic failure modes in mathematical and symbolic reasoning, refuting the widespread misconception that they possess intrinsic logical reasoning ability. Our findings establish an interpretable benchmark and methodological foundation for designing and evaluating trustworthy reasoning models.
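As one way to picture "search-augmented inference" and "strategic computation delay", here is a minimal best-of-N sketch: sample several candidate chains of thought and keep the one a verifier scores highest, spending extra inference-time compute instead of relying on a single forward pass. This is an illustrative assumption on our part, not the paper's method; `generate_candidate` and `score` are hypothetical stand-ins for an LLM sampler and a verifier/reward model.

```python
# Minimal sketch of search-augmented inference (best-of-N with a verifier).
# All names here (generate_candidate, score) are hypothetical placeholders,
# not APIs from the paper or from any specific model provider.
import random

def generate_candidate(prompt: str, temperature: float = 1.0) -> str:
    """Stand-in for one sampled chain of thought; a real system would call an LLM."""
    return f"candidate reasoning for {prompt!r} (t={temperature}, seed={random.random():.3f})"

def score(candidate: str) -> float:
    """Stand-in for a verifier/reward model that rates a candidate solution."""
    return random.random()

def search_augmented_answer(prompt: str, n: int = 16) -> str:
    """Sample n candidates and keep the best-scoring one.

    The added latency is the 'computation delay': accuracy is bought with
    search over samples, not with implicit one-shot logical deduction.
    """
    candidates = [generate_candidate(prompt) for _ in range(n)]
    return max(candidates, key=score)

if __name__ == "__main__":
    print(search_augmented_answer("Prove that the sum of two even integers is even."))
```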
📝 Abstract
We provide a broad unifying perspective on the recent breed of large reasoning models such as OpenAI o1 and DeepSeek R1, including their promise, sources of power, misconceptions, and limitations.