Meta-R1: Empowering Large Reasoning Models with Metacognition

📅 2025-08-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current large reasoning models (LRMs) lack explicit metacognitive mechanisms, resulting in uncontrolled, error-prone reasoning processes without methodological foundations. To address this, we propose Meta-R1, a novel framework that systematically integrates principles from cognitive science, specifically metacognition, into LRMs for the first time. Meta-R1 introduces a dual-layer reasoning architecture with strict separation between an object level (task execution) and a meta level (reasoning control), enabling active planning, online monitoring, error detection, dynamic strategy adaptation, and self-terminating inference. This design significantly enhances reasoning controllability, reliability, and flexibility. Experiments across three challenging benchmarks demonstrate that Meta-R1 achieves performance gains of up to 27.3%, reduces token consumption to 15.7%–32.7% of vanilla counterparts, improves inference efficiency by up to 14.8%, and generalizes robustly across diverse datasets and model architectures.

📝 Abstract
Large Reasoning Models (LRMs) demonstrate remarkable capabilities on complex tasks, exhibiting emergent, human-like thinking patterns. Despite their advances, we identify a fundamental limitation: current LRMs lack a dedicated meta-level cognitive system, an essential faculty in human cognition that enables "thinking about thinking". This absence leaves their emergent abilities uncontrollable (non-adaptive reasoning), unreliable (intermediate errors), and inflexible (lack of a clear methodology). To address this gap, we introduce Meta-R1, a systematic and generic framework that endows LRMs with explicit metacognitive capabilities. Drawing on principles from cognitive science, Meta-R1 decomposes the reasoning process into distinct object-level and meta-level components, orchestrating proactive planning, online regulation, and adaptive early stopping within a cascaded framework. Experiments on three challenging benchmarks and against eight competitive baselines demonstrate that Meta-R1 is: (I) high-performing, surpassing state-of-the-art methods by up to 27.3%; (II) token-efficient, reducing token consumption to 15.7%–32.7% and improving efficiency by up to 14.8% compared to its vanilla counterparts; and (III) transferable, maintaining robust performance across datasets and model backbones.
Problem

Research questions and friction points this paper is trying to address.

LRMs lack metacognitive capabilities for adaptive reasoning
Reasoning is unreliable because intermediate errors go undetected
Reasoning processes are inflexible, lacking a clear methodological foundation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Metacognitive framework for reasoning models
Decomposes reasoning into object and meta levels
Proactive planning with adaptive early stopping
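The dual-layer design can be caricatured as a control loop: the object level executes task steps, while the meta level plans, monitors each intermediate result, corrects detected errors, and stops early once the goal is reached. The sketch below is purely illustrative (a toy running-sum task with invented function names), not the paper's actual implementation:

```python
# Hypothetical sketch of a Meta-R1-style dual-layer loop. All names and the
# toy task (reach a target via a running sum) are illustrative assumptions.
from typing import List


def object_step(state: int, increment: int) -> int:
    """Object level: one unit of task execution (here, a toy running sum)."""
    return state + increment


def meta_monitor(state: int, target: int) -> str:
    """Meta level: inspect object-level state and issue a control signal."""
    if state == target:
        return "stop"       # adaptive early stopping: goal reached
    if state > target:
        return "backtrack"  # error detected (overshoot): revise strategy
    return "continue"       # online monitoring says all is well


def meta_r1_loop(target: int, plan: List[int], max_steps: int = 20) -> int:
    """Cascade: the meta level orchestrates object-level steps per the plan."""
    state, trace = 0, []
    for increment in plan[:max_steps]:
        state = object_step(state, increment)
        trace.append(state)
        signal = meta_monitor(state, target)
        if signal == "stop":
            break
        if signal == "backtrack":
            # Dynamic strategy adaptation: revert the erroneous step.
            state = trace[-2] if len(trace) > 1 else 0
    return state
```

Without the meta level, the loop would blindly consume the whole plan; with it, inference self-terminates at the goal and recovers from an overshooting step, mirroring the controllability and token-efficiency claims above.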
Haonan Dong
State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University
Haoran Ye
AI PhD @ Peking University
Agent · AI Safety and Alignment · AI Psychology · Learn to Optimize · Evolutionary Computation
Wenhao Zhu
ByteDance Seed
Large Language Model · Machine Translation
Kehan Jiang
State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University
Guojie Song
Tenured Research Professor, Peking University
Psychological AI · AI Safety & Value Alignment · Agent Cognition & Behavioral Modeling · LLM & GML