Enhancing Long Document Long Form Summarisation with Self-Planning

📅 2025-12-18

🤖 AI Summary

To address factual inconsistency, information loss, and poor traceability in long-document summarization, this paper proposes highlight-guided generation, a self-planning framework that uses sentence-level highlights as a content plan. The model first identifies salient source sentences via importance modeling and assembles them into a traceable content plan; it then generates the summary conditioned on that plan, which decouples content selection from surface realization. The approach improves summary faithfulness and the retention of fine-grained detail. On the GovReport benchmark, the best variant improves ROUGE-L by 4.1 points and SummaC scores by about 35%. Qualitative analysis indicates that the generated summaries preserve critical details more completely and are more accurate and insightful across domains.

📝 Abstract

We introduce highlight-guided generation, a novel approach to long-context summarisation that uses sentence-level information as a content plan to improve the traceability and faithfulness of generated summaries. Our framework applies self-planning methods to identify important content and then generates a summary conditioned on the resulting plan. We explore both end-to-end and two-stage variants of the approach and find that the two-stage pipeline performs better on long, information-dense documents. Experiments on long-form summarisation datasets demonstrate that our method consistently improves factual consistency while preserving relevance and overall quality. On GovReport, our best approach improves ROUGE-L by 4.1 points and achieves gains of about 35% in SummaC scores. Qualitative analysis shows that highlight-guided summarisation helps preserve important details, leading to more accurate and insightful summaries across domains.
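
For readers unfamiliar with the ROUGE-L metric cited above, the following is a minimal sketch of how an LCS-based ROUGE-L F1 score can be computed. It assumes whitespace tokenisation, lowercasing, and a balanced F1; published evaluations usually rely on an established scoring package with its own tokenisation and stemming, so absolute values will differ. The function names `lcs_length` and `rouge_l_f1` are illustrative and not taken from the paper. Gains reported in "points" usually refer to this score scaled to 0-100.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Balanced ROUGE-L F1 on whitespace tokens (simplifying assumption)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge_l_f1("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-L F1: {100 * score:.1f} points")
```

Because the longest common subsequence respects word order but tolerates gaps, the metric rewards summaries that follow the reference's phrasing without requiring contiguous matches.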

Problem

Research questions and friction points this paper is trying to address.

Improves factual consistency in long-document summarization.

Enhances traceability and faithfulness of generated summaries.

Preserves important details for accurate and insightful summaries.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-planning identifies important content for summaries.

Highlight-guided generation uses sentence-level information as a content plan.

Two-stage pipeline improves factual consistency in long documents.
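
The two-stage idea can be illustrated with a small sketch: stage one selects highlighted sentences as a plan, and stage two builds the input for a summary generator conditioned on that plan. This is not the paper's implementation. The word-frequency salience score is a stand-in for the self-planning model, and the names `split_sentences`, `score_sentences`, `build_plan`, and `build_prompt` are hypothetical. Keeping the source sentence index with each highlight is what makes the plan traceable.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter on ., !, ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(sentences):
    """Stand-in salience score: mean document frequency of a sentence's words.

    Assumption: the paper uses a model to pick highlights; this heuristic
    only keeps the sketch self-contained.
    """
    tokens = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    freq = Counter(w for toks in tokens for w in toks)
    return [sum(freq[w] for w in toks) / len(toks) if toks else 0.0
            for toks in tokens]


def build_plan(document, k=3):
    """Stage 1: pick the k highest-scoring sentences, kept in source order."""
    sentences = split_sentences(document)
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [(i, sentences[i]) for i in sorted(top)]


def build_prompt(plan):
    """Stage 2 input: a prompt that conditions generation on the plan."""
    highlights = "\n".join(f"[{i}] {s}" for i, s in plan)
    return ("Write a summary that covers only the highlighted sentences "
            "below and cites their indices.\n" + highlights)


if __name__ == "__main__":
    doc = ("The agency spent funds on audits. "
           "Audits found the agency funds were misused. "
           "It rained. "
           "The agency will repay the funds.")
    print(build_prompt(build_plan(doc, k=2)))
```

The prompt would then be passed to whatever generator is in use. Separating the two stages means the plan can be inspected or corrected before any text is generated.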
