Plan before Solving: Problem-Aware Strategy Routing for Mathematical Reasoning with LLMs

📅 2025-09-29
🤖 AI Summary
Existing mathematical reasoning methods rely on a fixed paradigm, such as natural language reasoning, code-augmented reasoning, tool-integrated reasoning, or ensemble-based reasoning, and therefore struggle to balance effectiveness and efficiency. To address this, we propose PRISM, a two-stage framework that decouples reasoning into *strategy planning* and *targeted execution*. PRISM introduces a problem-aware, confidence-driven routing mechanism: based on the predicted confidence distribution over candidate strategies, it selects single-strategy execution, dual-strategy verification, or multi-strategy exploration. Using the newly constructed MathStrat dataset, a multi-strategy preference dataset, we train a lightweight Strategy Adapter that scores the four reasoning paradigms. Across five mathematical reasoning benchmarks, PRISM consistently outperforms both single-strategy and ensemble baselines, with improvements of 0.9% to 7.6% across different base models and model architectures.

📝 Abstract
Existing methods usually rely on a fixed strategy, such as natural language reasoning, code-augmented reasoning, tool-integrated reasoning, or ensemble-based reasoning, to guide Large Language Models (LLMs) in mathematical reasoning. Our analysis reveals that a single strategy cannot adapt to problem-specific requirements and thus overlooks the trade-off between effectiveness and efficiency. To address these issues, we propose Planning and Routing through Instance-Specific Modeling (PRISM), a novel framework that decouples mathematical reasoning into two stages: strategy planning and targeted execution. Specifically, we first curate a multi-strategy preference dataset, which we call MathStrat, capturing correctness, process quality, and computational efficiency for each problem-strategy pair. Then, we train a lightweight Strategy Adapter on this dataset to obtain confidence distributions over the four reasoning strategies above. At inference time, an adaptive routing policy tailors the reasoning approach to the adapter's confidence: it directs the model to use single-strategy execution for high-confidence predictions, dual-strategy verification for competitive scenarios, or comprehensive multi-strategy exploration for uncertain cases. Extensive experiments across five mathematical reasoning benchmarks demonstrate that PRISM consistently outperforms individual strategies and ensemble baselines, achieving improvements ranging from 0.9% to 7.6% across different base models. The adaptive routing approach shows particularly strong benefits for mathematical reasoning tasks across diverse model architectures. Our code is released at https://github.com/reml-group/PRISM.
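The three routing modes described in the abstract can be illustrated with a minimal sketch. This is not the released PRISM code: the strategy identifiers, the `route` function, and both threshold values are hypothetical, and the rule used here for a "competitive" case (the top two strategies together holding most of the probability mass) is only one plausible reading of the abstract.

```python
from typing import Dict, List

# Hypothetical thresholds; the abstract does not state the actual values.
HIGH_CONFIDENCE = 0.7   # top strategy alone is trusted at or above this
TOP_TWO_MASS = 0.7      # top two strategies jointly dominate at or above this


def route(confidence: Dict[str, float]) -> List[str]:
    """Return the strategies to execute, ordered from most to least confident.

    `confidence` maps a strategy name to its predicted probability.
    """
    ranked = sorted(confidence, key=lambda name: confidence[name], reverse=True)
    top, runner_up = ranked[0], ranked[1]

    # High confidence: execute only the top strategy.
    if confidence[top] >= HIGH_CONFIDENCE:
        return [top]

    # Competitive (assumed definition): no clear winner, but the top two
    # strategies together dominate, so run both and cross-check the answers.
    if confidence[top] + confidence[runner_up] >= TOP_TWO_MASS:
        return [top, runner_up]

    # Uncertain: explore every candidate strategy.
    return ranked
```

Under these assumed thresholds, a distribution such as `{"natural_language": 0.45, "code_augmented": 0.40, "tool_integrated": 0.10, "ensemble": 0.05}` would be routed to dual-strategy verification, since neither leading strategy reaches 0.7 alone but together they hold 0.85 of the mass.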
Problem

Research questions and friction points this paper is trying to address.

Adapting reasoning strategies to specific mathematical problem requirements
Overcoming limitations of fixed single-strategy approaches in mathematical reasoning
Balancing effectiveness and efficiency trade-offs in mathematical reasoning methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage framework decouples planning and execution
Lightweight Strategy Adapter predicts confidence for strategies
Adaptive routing policy dynamically tailors reasoning approach
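The decoupling of planning from execution listed above can be sketched as two small functions. Everything here is illustrative rather than taken from the paper: the softmax normalization of adapter scores, the majority-vote aggregation of answers, and the names `plan`, `execute`, and `executors` are all assumptions, and the executors are stand-in callables rather than real LLM calls.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def plan(scores: Dict[str, float]) -> Dict[str, float]:
    """Stage 1 (planning): turn raw per-strategy scores into a confidence
    distribution. Softmax is an assumption, not confirmed by the source."""
    peak = max(scores.values())
    exps = {name: math.exp(s - peak) for name, s in scores.items()}  # stable softmax
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def execute(
    problem: str,
    strategies: List[str],
    executors: Dict[str, Callable[[str], str]],
) -> str:
    """Stage 2 (execution): run each selected strategy and return the most
    common answer. Majority voting is an assumed aggregation rule; on ties,
    the answer from the earliest strategy in `strategies` wins."""
    answers = [executors[name](problem) for name in strategies]
    return Counter(answers).most_common(1)[0][0]
```

In use, a caller would compute `confidence = plan(adapter_scores)`, choose one or more strategy names from that distribution, and pass them to `execute` together with a dictionary of solver callables. Keeping the two stages separate means the planner can be retrained or swapped without touching the solvers.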
Authors

Shihao Qi
School of Computer Science and Technology, Xi’an Jiaotong University
Jie Ma
MOE KLINNS Lab, Xi’an Jiaotong University
Ziang Yin
ASU PhD Student in Computer Engineering
Lingling Zhang
Assistant Professor, Xi’an Jiaotong University
Computer vision, Few-shot learning, Zero-shot learning
Jian Zhang
School of Computer Science and Technology, Xi’an Jiaotong University
Jun Liu
School of Computer Science and Technology, Xi’an Jiaotong University
Feng Tian
School of Computer Science and Technology, Xi’an Jiaotong University
Tongliang Liu
Director, Sydney AI Centre, University of Sydney & Mohamed bin Zayed University of AI
Machine Learning, Learning with Noisy Labels, Trustworthy Machine Learning