🤖 AI Summary
Existing summarization methods struggle to control individual quality dimensions, such as completeness, conciseness, and faithfulness, because improving one often degrades another. This work proposes a ranking-based loss function that uses scores from fine-grained evaluation models such as FineSurE to guide generation, enabling explicit control over a chosen quality dimension while still improving overall summary quality. Experiments on three pretrained large language models (LLaMA, Qwen, and Mistral) show performance comparable to state-of-the-art summarizers, together with strong controllability over individual quality attributes.
📝 Abstract
Recent advances in summarization research focus on improving summary quality across multiple criteria, such as completeness, conciseness, and faithfulness, by jointly optimizing these dimensions. However, these efforts largely overlook the challenge of controlling summary generation with respect to individual criteria, especially in the presence of their inherent trade-offs. For example, enhancing conciseness can compromise completeness, and vice versa. In this work, we address this gap by proposing a loss function that aligns model outputs with fine-grained, model-based evaluation scores (e.g., from FineSurE), enabling both improvement in summary quality and dimension-specific control. Our approach improves the overall quality of summaries while maintaining the ability to selectively prioritize one criterion over others. Experiments on three pretrained models (LLaMA, Qwen, and Mistral) demonstrate that our method achieves performance comparable to state-of-the-art summarizers, while uniquely offering strong controllability over individual quality dimensions.