🤖 AI Summary
This work addresses the low efficiency and high computational cost of chain-of-thought (CoT) reasoning transfer across large language models (LLMs). We propose an adaptive reasoning summarization framework integrating semantic segmentation, importance scoring, budget-aware dynamic compression, and coherence reconstruction, augmented by Gaussian process Bayesian optimization for token-level resource allocation. Our key contributions are: (1) uncovering a power-law relationship between model scale and cross-domain robustness; and (2) designing a lightweight transfer mechanism that preserves critical reasoning paths without fine-tuning the target model. Evaluated on 7,501 medical examination questions spanning 10 specialties, our method achieves up to a 40% accuracy gain over naive truncation under identical token budgets. It demonstrates strong generalization and cross-architecture compatibility across 64 LLM transfer pairs drawn from eight model families.
📝 Abstract
Chain-of-Thought (CoT) reasoning enhances the problem-solving ability of large language models (LLMs) but leads to substantial inference overhead, limiting deployment in resource-constrained settings. This paper investigates efficient CoT transfer across models of different scales and architectures through an adaptive reasoning summarization framework. The proposed method compresses reasoning traces via semantic segmentation with importance scoring, budget-aware dynamic compression, and coherence reconstruction, preserving critical reasoning steps while significantly reducing token usage. Experiments on 7,501 medical examination questions across 10 specialties show up to 40% higher accuracy than truncation under the same token budgets. Evaluations on 64 model pairs from eight LLMs (1.5B–32B parameters, including DeepSeek-R1 and Qwen3) confirm strong cross-model transferability. Furthermore, a Gaussian process-based Bayesian optimization module reduces evaluation cost by 84% and reveals a power-law relationship between model size and cross-domain robustness. These results demonstrate that reasoning summarization provides a practical path toward efficient CoT transfer, enabling advanced reasoning under tight computational constraints. Code will be released upon publication.
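The compression pipeline the abstract describes (segment the trace, score each segment's importance, select under a token budget, then restore order for coherence) can be sketched minimally as below. The sentence-level segmentation, the keyword-overlap importance score, and the word-count token estimate are all hypothetical stand-ins, not the authors' implementation:

```python
import re

# Illustrative sketch of budget-aware reasoning-trace compression.
# The scorer (keyword overlap with the question) is a toy stand-in
# for the paper's importance-scoring component.

def split_segments(trace: str) -> list[str]:
    """Naive semantic segmentation: split the CoT trace at sentence ends."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", trace) if s.strip()]

def importance(segment: str, question: str) -> float:
    """Toy importance score: fraction of question words echoed in the segment."""
    q_words = set(re.findall(r"\w+", question.lower()))
    s_words = set(re.findall(r"\w+", segment.lower()))
    return len(q_words & s_words) / max(len(q_words), 1)

def compress(trace: str, question: str, token_budget: int) -> str:
    """Greedily keep the highest-scoring segments that fit the budget,
    then emit them in their original order (coherence reconstruction)."""
    segments = split_segments(trace)
    ranked = sorted(range(len(segments)),
                    key=lambda i: importance(segments[i], question),
                    reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = len(segments[i].split())  # crude per-segment token count
        if used + cost <= token_budget:
            kept.add(i)
            used += cost
    return " ".join(segments[i] for i in sorted(kept))
```

For example, compressing a three-sentence trace about a fever case under a small budget keeps the two fever-related segments and drops the irrelevant aside, while preserving their original order.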