🤖 AI Summary
Large language models (LLMs) often suffer from limited generalization and poor interpretability in mathematical reasoning due to reliance on a single reasoning paradigm. To address this, we propose Chain-of-Reasoning (CoR), a multi-paradigm collaborative reasoning framework that unifies natural-language reasoning, symbolic computation, and algorithmic execution within a coherent generation and consistency-aware aggregation mechanism. Our approach introduces three key innovations: (1) progressive paradigm training (PPT) to jointly optimize heterogeneous reasoning paths; (2) symbolic constraint injection and algorithmic template guidance to enhance formal correctness; and (3) a multi-answer synthesis mechanism to improve robustness. Evaluated under zero-shot cross-task settings, the CoR-Math-7B model achieves a 41.0% absolute improvement over GPT-4 in theorem proving and outperforms state-of-the-art reinforcement learning methods by 7.9% on arithmetic tasks. Overall, CoR significantly advances comprehensive mathematical reasoning capability while ensuring greater transparency and interpretability of reasoning pathways.
📝 Abstract
Large Language Models (LLMs) have made notable progress in mathematical reasoning, yet they often rely on single-paradigm reasoning, which limits their effectiveness across diverse tasks. In this paper, we introduce Chain-of-Reasoning (CoR), a novel unified framework that integrates multiple reasoning paradigms, namely Natural Language Reasoning (NLR), Algorithmic Reasoning (AR), and Symbolic Reasoning (SR), to enable synergistic collaboration. CoR generates multiple potential answers using different reasoning paradigms and synthesizes them into a coherent final solution. We propose a Progressive Paradigm Training (PPT) strategy that allows models to progressively master these paradigms, culminating in the development of CoR-Math-7B. Experimental results demonstrate that CoR-Math-7B significantly outperforms current SOTA models, achieving up to a 41.0% absolute improvement over GPT-4 in theorem proving tasks and a 7.9% improvement over RL-based methods in arithmetic tasks. These results demonstrate the comprehensive mathematical ability of our model, which achieves significant performance gains on specific tasks and generalizes zero-shot across tasks.
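The multi-answer synthesis step described above can be sketched as a consistency-aware vote over the answers produced by each paradigm. The function name, paradigm labels, and plain majority-vote tie-breaking rule below are illustrative assumptions, not the paper's actual aggregation mechanism:

```python
from collections import Counter

def synthesize_answers(candidates):
    """Pick a final answer from per-paradigm candidates by majority vote.

    `candidates` maps a paradigm name (e.g. "NLR", "AR", "SR") to the
    answer that paradigm produced. This is a hypothetical sketch of
    consistency-aware aggregation, not the paper's implementation.
    """
    counts = Counter(candidates.values())
    best_count = max(counts.values())
    # Among answers tied for the highest agreement, keep the first one
    # produced (dict order preserves the caller's paradigm order).
    for answer in candidates.values():
        if counts[answer] == best_count:
            return answer

# Example: the algorithmic and symbolic paradigms agree, so their
# shared answer outvotes the natural-language outlier.
final = synthesize_answers({"NLR": "42", "AR": "41", "SR": "41"})
```

When all paradigms disagree, this sketch falls back to the first paradigm's answer; a real system might instead re-prompt or weight paradigms by task type.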