🤖 AI Summary
This work addresses the absence of human-like systematic reasoning (System 2 thinking) in large language models (LLMs). It proposes Meta-CoT, a framework that elevates chain-of-thought (CoT) reasoning from answer generation to explicit modeling of the reasoning process itself: Meta-CoT learns to generate interpretable, verifiable, and optimizable reasoning paths rather than only final answers. Methodologically, it introduces an end-to-end training pipeline combining process supervision, synthetic reasoning-trajectory generation, instruction tuning on linearized search traces, and reinforcement learning. On complex reasoning benchmarks, models trained this way exhibit emergent in-context search behavior, with gains in both generalization and interpretability. The work outlines a paradigm and practical pathway toward controllable, deep, and traceable System 2 reasoning in LLMs.
📝 Abstract
We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT. We present empirical evidence from state-of-the-art models exhibiting behaviors consistent with in-context search, and explore methods for producing Meta-CoT via process supervision, synthetic data generation, and search algorithms. We then outline a concrete pipeline for training a model to produce Meta-CoTs, incorporating instruction tuning with linearized search traces and reinforcement learning post-training. Finally, we discuss open research questions, including scaling laws, verifier roles, and the potential for discovering novel reasoning algorithms. This work provides a theoretical and practical roadmap to enable Meta-CoT in LLMs, paving the way for more powerful and human-like reasoning in artificial intelligence.
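To make "instruction tuning with linearized search traces" concrete, the sketch below shows one plausible way to flatten a search tree into a single token sequence that records both explored dead ends and explicit backtracking. The node structure, the `<node>`/`<backtrack/>`/`<answer>` tags, and the depth-first order are illustrative assumptions for this sketch, not the paper's actual trace format.

```python
def linearize(tree):
    """Depth-first traversal of a search tree, emitting each visited step
    and an explicit backtrack marker whenever a branch dead-ends."""
    tokens = []

    def visit(node):
        tokens.append(f"<node>{node['step']}</node>")
        if node.get("solution"):  # terminal node that solves the problem
            tokens.append(f"<answer>{node['step']}</answer>")
            return True
        for child in node.get("children", []):
            if visit(child):
                return True
            tokens.append("<backtrack/>")  # failed branch: record the retreat
        return False

    visit(tree)
    return " ".join(tokens)

# Hypothetical trace: the model tries a failing branch before the correct one.
trace = {
    "step": "factor x^2-5x+6",
    "children": [
        {"step": "try (x-1)(x-6)"},                    # dead end
        {"step": "try (x-2)(x-3)", "solution": True},  # correct factorization
    ],
}
print(linearize(trace))
```

The resulting string, which interleaves attempts, backtracks, and the final answer, could then serve as a supervised target so the model learns to verbalize its search rather than only its conclusion.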