🤖 AI Summary
Multidimensional summary refinement faces challenges including the difficulty of balancing multiple dimensions and inefficient use of feedback. This paper proposes the first end-to-end refinement framework based on reflective reasoning, leveraging Chain-of-Thought (CoT) reasoning over extended sequences (Long-CoT) to model the couplings among multidimensional feedback signals, thereby jointly optimizing coherence, factual consistency, and conciseness. Key contributions include: (1) a novel reflective reasoning mechanism that explicitly models the interplay among feedback exposure volume, dimension count, and reasoning strategy; (2) SumFeed-CoT, the first Long-CoT-based dataset for optimizing reflective reasoning over summarization feedback; and (3) a lightweight architecture with a noise-robust, structured feedback fusion method. Experiments show that our model achieves state-of-the-art performance despite a 40% reduction in parameter count, yielding ROUGE improvements of 3.2–5.8 points across multiple benchmarks, while demonstrating strong robustness to feedback noise and ordering variations.
📝 Abstract
Summarization refinement faces challenges when extended to multiple dimensions. In this paper, we introduce ReFeed, a powerful summarization refinement pipeline that enhances multiple dimensions through reflective reasoning on feedback. To achieve this, we release SumFeed-CoT, a large-scale Long-CoT-based dataset optimized for training a lightweight model with reflective reasoning. Our experiments reveal how the number of dimensions, feedback exposure, and reasoning policy influence refinement performance, highlighting that reflective reasoning and addressing multiple feedback signals simultaneously are crucial to mitigating trade-offs between dimensions. Furthermore, ReFeed is robust to noisy feedback and to feedback order. Lastly, our findings emphasize that creating data with a proper goal and guideline constitutes a fundamental pillar of effective reasoning. The dataset and model will be released.
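The abstract's key design point, that all dimensions of feedback should be presented to the reasoner jointly rather than fixed one at a time, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimension names, the `Feedback` record, and the `toy_reasoner` stub are all hypothetical stand-ins (in practice the reasoner would be a Long-CoT language model).

```python
from dataclasses import dataclass

# Hypothetical refinement dimensions; the paper's actual dimension set may differ.
DIMENSIONS = ("faithfulness", "completeness", "conciseness")

@dataclass
class Feedback:
    dimension: str  # which quality dimension the feedback targets
    issue: str      # a natural-language description of the problem

def refine(summary, feedback_items, reason_fn):
    """Joint (simultaneous) refinement: group ALL feedback by dimension and
    hand it to the reasoner in one call, so it can trade off between
    dimensions instead of fixing one dimension and undoing earlier fixes."""
    grouped = {d: [f.issue for f in feedback_items if f.dimension == d]
               for d in DIMENSIONS}
    return reason_fn(summary, grouped)

def toy_reasoner(summary, grouped):
    # Stand-in for a reflective Long-CoT model: just records which
    # dimensions actually received feedback.
    addressed = [d for d in DIMENSIONS if grouped[d]]
    return summary + " [revised for: " + ", ".join(addressed) + "]"

fb = [Feedback("faithfulness", "date is wrong"),
      Feedback("conciseness", "too long")]
print(refine("Draft summary.", fb, toy_reasoner))
# prints "Draft summary. [revised for: faithfulness, conciseness]"
```

The contrast is with a sequential policy that would call the reasoner once per dimension; the abstract reports that such per-dimension refinement aggravates trade-offs between dimensions.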