🤖 AI Summary
Aspect-based summarization of long texts such as books suffers from a scarcity of high-quality reference summaries and from the prohibitive cost and poor scalability of human evaluation. Method: This paper proposes BookAsSumQA, an automated, reference-free evaluation framework for aspect-based summarization of long literary texts. It leverages narrative knowledge graphs to automatically generate aspect-specific question-answer (QA) pairs, eliminating the need for manually annotated reference summaries, and uses QA accuracy achievable from a candidate summary as a proxy metric for summary quality. Contribution/Results: Experiments show that BookAsSumQA reliably discriminates among diverse summarization methods. Notably, LLM-only approaches are more accurate on shorter texts, but retrieval-augmented generation (RAG) becomes more effective as document length increases, making RAG the more efficient and practical choice for aspect-based book summarization. The framework is scalable and enables efficient, reference-free evaluation.
📝 Abstract
Aspect-based summarization aims to generate summaries that highlight specific aspects of a text, enabling more personalized and targeted summaries. However, its application to books remains unexplored due to the difficulty of constructing reference summaries for long texts. To address this challenge, we propose BookAsSumQA, a QA-based evaluation framework for aspect-based book summarization. BookAsSumQA automatically generates aspect-specific QA pairs from a narrative knowledge graph and evaluates summary quality based on question-answering performance. Our experiments using BookAsSumQA revealed that while LLM-based approaches showed higher accuracy on shorter texts, RAG-based methods become more effective as document length increases, making them more efficient and practical for aspect-based book summarization.
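The core evaluation idea, scoring a summary by how many aspect-specific questions it can answer, can be sketched as follows. This is a minimal illustrative sketch, not BookAsSumQA's actual implementation: the `QAPair` type, `qa_accuracy` function, and the toy answerer (which stands in for the LLM the framework would use) are all hypothetical names, and real systems would use softer answer matching than exact string comparison.

```python
# Hypothetical sketch of QA-based summary evaluation (illustrative names,
# not the paper's API). Idea: answer each aspect-specific question using
# ONLY the candidate summary, then report accuracy against reference
# answers (which BookAsSumQA derives from a narrative knowledge graph).
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class QAPair:
    question: str
    answer: str  # reference answer for this aspect-specific question

def qa_accuracy(summary: str,
                qa_pairs: List[QAPair],
                answer_fn: Callable[[str, str], str]) -> float:
    """Proxy quality score: fraction of questions answered correctly when
    answer_fn sees only the summary. In practice answer_fn wraps an LLM;
    exact-match scoring here is a simplification."""
    if not qa_pairs:
        return 0.0
    correct = sum(
        answer_fn(summary, qa.question).strip().lower()
        == qa.answer.strip().lower()
        for qa in qa_pairs
    )
    return correct / len(qa_pairs)

# Toy stand-in for an LLM answerer: returns the reference answer only if
# it is literally supported by (appears in) the summary text.
def toy_answerer(pairs: List[QAPair]) -> Callable[[str, str], str]:
    lookup = {p.question: p.answer for p in pairs}
    def answer(summary: str, question: str) -> str:
        ref = lookup[question]
        return ref if ref.lower() in summary.lower() else "unknown"
    return answer

pairs = [QAPair("Who betrays the protagonist?", "Iago"),
         QAPair("Where does the story open?", "Venice")]
summary = "Set in Venice, the play follows Othello, undone by Iago's schemes."
score = qa_accuracy(summary, pairs, toy_answerer(pairs))
print(score)  # → 1.0 for this summary; an off-aspect summary scores lower
```

A summary that omits the queried aspects answers fewer questions and scores lower, which is what lets QA accuracy separate aspect-focused summaries from generic ones.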