Short Chains, Deep Thoughts: Balancing Reasoning Efficiency and Intra-Segment Capability via Split-Merge Optimization

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the high latency and computational overhead that large reasoning models incur through structural redundancy in long reasoning chains. To mitigate this, the authors propose CoSMo, a framework that combines consistency-guided dynamic split-merge optimization, structure-aligned reinforcement learning, and a segment-level computational budget to streamline reasoning chains while preserving logical completeness. Experiments show that CoSMo improves accuracy by an average of 3.3 percentage points across multiple benchmarks and backbone models while reducing the number of reasoning segments by 28.7%, striking a favorable balance between efficiency and reasoning capability.

📝 Abstract
While Large Reasoning Models (LRMs) have demonstrated impressive capabilities in solving complex tasks through the generation of long reasoning chains, this reliance on verbose generation results in significant latency and computational overhead. To address these challenges, we propose CoSMo (Consistency-Guided Split-Merge Optimization), a framework designed to eliminate structural redundancy rather than indiscriminately restricting token volume. Specifically, CoSMo utilizes a split-merge algorithm that dynamically refines reasoning chains by merging redundant segments and splitting logical gaps to ensure coherence. We then employ structure-aligned reinforcement learning with a novel segment-level budget to supervise the model in maintaining efficient reasoning structures throughout training. Extensive experiments across multiple benchmarks and backbones demonstrate that CoSMo achieves superior performance, improving accuracy by 3.3 points while reducing segment usage by 28.7% on average compared to reasoning efficiency baselines.
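The abstract describes the merge half of the split-merge pass only at a high level; the paper's actual consistency criterion and segment representation are not given here. The sketch below is a minimal illustration of merging redundant adjacent segments, assuming a simple Jaccard word-overlap score as a stand-in for the consistency signal; `merge_redundant` and the `threshold` value are hypothetical names, not the authors' implementation.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard overlap between two reasoning segments."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def merge_redundant(segments: list[str], threshold: float = 0.6) -> list[str]:
    """Greedily merge adjacent segments whose content overlaps heavily,
    keeping the longer (more informative) of each near-duplicate pair."""
    merged: list[str] = []
    for seg in segments:
        if merged and jaccard(merged[-1], seg) >= threshold:
            if len(seg) > len(merged[-1]):
                merged[-1] = seg
        else:
            merged.append(seg)
    return merged
```

A real system would score consistency with the model itself and also split segments that skip a reasoning step; this sketch covers only the redundancy-removal direction.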
Problem

Research questions and friction points this paper is trying to address.

Large Reasoning Models
reasoning efficiency
computational overhead
reasoning chains
latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Split-Merge Optimization
Reasoning Efficiency
Structural Redundancy
Segment-Level Budget
Consistency-Guided Reinforcement Learning
Runquan Gui
University of Science and Technology of China
Jie Wang
University of Science and Technology of China
Zhihai Wang
Qwen Team, PhD, USTC
Sample-Efficient Reinforcement Learning, RL4LLM, Agentic RL
Chi Ma
University of Science and Technology of China
Jianye Hao
Huawei Noah's Ark Lab/Tianjin University
Multiagent Systems, Embodied AI
Feng Wu
National University of Singapore
Machine Learning, Medical Time Series