Optimizing Reasoning Efficiency through Prompt Difficulty Prediction

📅 2025-11-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the challenge of balancing computational efficiency and accuracy in large language models (LLMs) for complex reasoning tasks, this paper proposes a difficulty-aware dynamic model routing framework. Methodologically, it trains a lightweight difficulty predictor on intermediate representations extracted from the S1.1-32B model to perform fine-grained difficulty estimation for mathematical reasoning problems; inputs are then dynamically routed to appropriately sized smaller models based on predicted difficulty. This work is the first to leverage intermediate representations for difficulty modeling and dynamic model allocation in complex reasoning tasks. Experiments across multiple mathematical reasoning benchmarks demonstrate that the approach achieves accuracy comparable to—or even exceeding—that of S1.1-32B, while reducing total computational cost by up to 67% in FLOPs. It significantly outperforms both random and static model allocation baselines.

📝 Abstract
Reasoning language models perform well on complex tasks but are costly to deploy due to their size and long reasoning traces. We propose a routing approach that assigns each problem to the smallest model likely to solve it, reducing compute without sacrificing accuracy. Using intermediate representations from s1.1-32B, we train lightweight predictors of problem difficulty or model correctness to guide routing across a pool of reasoning models. On diverse math benchmarks, routing improves efficiency over random assignment and matches s1.1-32B's performance while using significantly less compute. Our results demonstrate that difficulty-aware routing is effective for cost-efficient deployment of reasoning models.
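The abstract's core idea, training a lightweight difficulty predictor on intermediate representations, can be illustrated with a minimal sketch. Everything below is hypothetical: the feature vectors stand in for hidden states extracted from s1.1-32B, the labels stand in for observed problem difficulty, and the predictor is a plain logistic regression fit with gradient descent (the paper does not specify the predictor architecture).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for intermediate representations extracted from
# s1.1-32B: random feature vectors whose first coordinate correlates
# with problem difficulty (1 = hard, 0 = easy).
X = rng.normal(size=(200, 16))
difficulty = (X[:, 0] > 0).astype(float)

# Lightweight difficulty predictor: logistic regression trained with
# plain gradient descent (no framework dependency).
w = np.zeros(16)
b = 0.0
lr = 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(hard)
    w -= lr * (X.T @ (p - difficulty)) / len(X)
    b -= lr * np.mean(p - difficulty)

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
accuracy = float(np.mean(preds == difficulty))
```

Because the predictor is tiny relative to the reasoning models it routes between, its own inference cost is negligible next to the compute it saves.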
Problem

Research questions and friction points this paper is trying to address.

Predicting problem difficulty to optimize model selection
Reducing computational costs of large reasoning language models
Routing problems to smallest capable model while maintaining accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses difficulty prediction for routing problems
Assigns problems to smallest capable model
Reduces compute while maintaining accuracy
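The routing rule implied by these bullets, send each problem to the smallest model likely to solve it, can be sketched as a threshold lookup over an ordered model pool. The model names, FLOPs figures, and difficulty ceilings below are illustrative assumptions, not values from the paper.

```python
# Hypothetical model pool, ordered smallest to largest. The "flops" and
# "max_difficulty" values are made-up placeholders for illustration.
MODEL_POOL = [
    {"name": "s1.1-1.5B", "flops": 1.0,  "max_difficulty": 0.3},
    {"name": "s1.1-7B",   "flops": 5.0,  "max_difficulty": 0.6},
    {"name": "s1.1-32B",  "flops": 20.0, "max_difficulty": 1.0},
]

def route(predicted_difficulty: float) -> dict:
    """Return the smallest model whose difficulty ceiling covers the
    predicted difficulty of the input problem."""
    for model in MODEL_POOL:  # smallest first, so first hit is cheapest
        if predicted_difficulty <= model["max_difficulty"]:
            return model
    return MODEL_POOL[-1]  # fall back to the largest model
```

Under this scheme, easy problems never touch the 32B model, which is where the reported compute savings come from; only problems the predictor deems hard pay the full cost.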
Bo Zhao
UC San Diego
B. Kapusuzoglu
Capital One
Kartik Balasubramaniam
Capital One
Sambit Sahu
Capital One
Generative AI · LLM Pre-training · Inference Optimization
Supriyo Chakraborty
Capital One
Genta Indra Winata
Capital One AI Foundations
Multilinguality · Language Modeling · Multimodal · Low-resource NLP · Code-Switching