TrigReason: Trigger-Based Collaboration between Small and Large Reasoning Models

📅 2026-04-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitations of both small and large reasoning models on complex reasoning tasks: small models are prone to path divergence, cognitive overload, and an inability to recover from errors, while large models incur high latency and substantial computational cost. To reconcile these trade-offs, the authors propose TrigReason, a trigger-based collaborative reasoning framework in which a small model leads the inference process and invokes a large model only at critical junctures through three triggers: strategic priming (initial planning), cognitive offload (fired on abnormally high confidence), and intervention request (fired when reasoning falls into an unproductive loop). The framework combines path monitoring, confidence estimation, and loop detection to decide when the large model is needed. On AIME24, AIME25, and GPQA-D, TrigReason matches the accuracy of the large model used alone and of SpecReason, while offloading 1.70x to 4.79x more reasoning steps to the small model. In edge-cloud settings it reduces latency by 43.9% and API cost by 73.3%.
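The summary does not say how confidence or loops are measured. As a rough illustration only, the sketch below assumes a step's confidence is the geometric mean of its token probabilities and that a loop is a step repeated several times within a recent window. The function names `step_confidence` and `in_loop`, and the `window` and `max_repeats` parameters, are hypothetical and are not taken from the paper or its code.

```python
import math
from typing import Sequence


def step_confidence(token_logprobs: Sequence[float]) -> float:
    """Geometric-mean token probability of one reasoning step.

    Assumed proxy for confidence; the paper may define it differently.
    """
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def _normalize(step: str) -> str:
    # Lowercase and collapse whitespace so trivially different repeats match.
    return " ".join(step.lower().split())


def in_loop(steps: Sequence[str], window: int = 6, max_repeats: int = 2) -> bool:
    """Return True if any step occurs more than `max_repeats` times
    among the last `window` steps (assumed loop heuristic)."""
    recent = [_normalize(s) for s in steps[-window:]]
    return any(recent.count(s) > max_repeats for s in set(recent))
```

A real system would likely need a softer similarity measure than exact string matching, since looping models rarely repeat themselves verbatim.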

📝 Abstract

Large Reasoning Models (LRMs) achieve strong performance on complex tasks through extended chains of thought but suffer from high inference latency due to autoregressive reasoning. Recent work explores using Small Reasoning Models (SRMs) to accelerate LRM inference. In this paper, we systematically characterize the capability boundaries of SRMs and identify three common types of reasoning risks: (1) path divergence, where SRMs lack the strategic ability to construct an initial plan, causing reasoning to deviate from the most probable path; (2) cognitive overload, where SRMs fail to solve particularly difficult steps; and (3) recovery inability, where SRMs lack robust self-reflection and error correction mechanisms. To address these challenges, we propose TrigReason, a trigger-based collaborative reasoning framework that replaces continuous polling with selective intervention. TrigReason delegates most reasoning to the SRM and activates LRM intervention only when necessary: during initial strategic planning (strategic priming trigger), upon detecting extraordinary overconfidence (cognitive offload trigger), or when reasoning falls into unproductive loops (intervention request trigger). Evaluation results on AIME24, AIME25, and GPQA-D indicate that TrigReason matches the accuracy of full LRMs and SpecReason, while offloading 1.70x to 4.79x more reasoning steps to SRMs. Under edge-cloud conditions, TrigReason reduces latency by 43.9% and API cost by 73.3%. Our code is available at https://github.com/QQQ-yi/TrigReason
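The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `StepFn` interface, the `trig_reason` function, the `overconf` threshold of 0.99, and the exact-repeat loop check are all hypothetical, and the stub models exist only to make the example runnable.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interface: (problem, steps so far) -> (next step, confidence in [0, 1]).
StepFn = Callable[[str, List[str]], Tuple[str, float]]


@dataclass
class Trace:
    steps: List[str] = field(default_factory=list)
    lrm_calls: int = 0


def trig_reason(problem: str, srm_step: StepFn, lrm_step: StepFn,
                is_done: Callable[[str], bool], overconf: float = 0.99,
                loop_repeats: int = 2, max_steps: int = 64) -> Trace:
    trace = Trace()
    # Strategic priming trigger: the LRM writes the initial plan.
    plan, _ = lrm_step(problem, [])
    trace.steps.append(plan)
    trace.lrm_calls += 1
    while len(trace.steps) < max_steps and not is_done(trace.steps[-1]):
        step, conf = srm_step(problem, trace.steps)
        # Intervention request trigger: the SRM keeps proposing a step it already took.
        looping = trace.steps.count(step) >= loop_repeats
        # Cognitive offload trigger: suspiciously high SRM confidence (assumed threshold).
        if conf >= overconf or looping:
            step, _ = lrm_step(problem, trace.steps)
            trace.lrm_calls += 1
        trace.steps.append(step)
    return trace


# Stub models, for illustration only.
def stub_srm(problem: str, steps: List[str]) -> Tuple[str, float]:
    if len(steps) == 1:
        return "x = 2", 0.80
    if len(steps) == 2:
        return "obviously 5", 0.999  # overconfident step, triggers offload
    return "answer: 4", 0.90


def stub_lrm(problem: str, steps: List[str]) -> Tuple[str, float]:
    return ("plan: isolate x", 1.0) if not steps else ("2 + 2 = 4", 1.0)


trace = trig_reason("x + x = ?", stub_srm, stub_lrm,
                    is_done=lambda s: s.startswith("answer:"))
```

In this run the LRM is called twice (once for the plan, once for the overconfident step) and the SRM produces the remaining steps, which is the selective-intervention pattern the abstract contrasts with continuous polling.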

Problem

Research questions and friction points this paper is trying to address.

reasoning risks

path divergence

cognitive overload

recovery inability

model collaboration

Innovation

Methods, ideas, or system contributions that make the work stand out.

trigger-based collaboration

reasoning risk

selective intervention