AI Summary
Existing automatic summarization methods for scientific documents lack explicit modeling of explanatory content, resulting in imbalanced explanation ratios in generated summaries and diminished comprehensibility. To address this, we propose a discourse-structure-driven two-stage planning framework tailored for science communication summarization. Our approach decouples planning from generation by treating discourse planning either as an input condition or as a prefix to the output, marking the first such synergistic, disentangled design. It integrates the PDTB/DRS discourse analysis framework, prompt-augmented sequence generation, controllable decoding, and multi-stage planning. Evaluated on three science communication summarization datasets, our method consistently outperforms state-of-the-art baselines. Both automatic and human evaluations confirm significant improvements in explanatory appropriateness, factual consistency, and controllability, alongside effective hallucination suppression.
Abstract
Lay summaries for scientific documents typically include explanations to help readers grasp sophisticated concepts or arguments. However, current automatic summarization methods do not explicitly model explanations, which makes it difficult to align the proportion of explanatory content with that of human-written summaries. In this paper, we present a plan-based approach that leverages discourse frameworks to organize summary generation and guide explanatory sentences by prompting responses to the plan. Specifically, we propose two discourse-driven planning strategies, in which the plan is conditioned as part of the input or as part of the output prefix, respectively. Empirical experiments on three lay summarization datasets show that our approach outperforms existing state-of-the-art methods in terms of summary quality, and that it enhances model robustness and controllability while mitigating hallucination.
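
The difference between the two planning strategies can be illustrated with a minimal Python sketch of how a training example might be formatted under each one. This is not the authors' implementation: the marker tokens (`[PLAN]`, `[DOC]`, `[SUMMARY]`), the separator, the relation labels, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical marker tokens and separator; the paper's actual format may differ.
PLAN_TAG = "[PLAN]"
DOC_TAG = "[DOC]"
SUMMARY_TAG = "[SUMMARY]"
PLAN_SEP = " | "


@dataclass
class Example:
    source: str  # text given to the model as input
    target: str  # text the model is trained to generate


def serialize_plan(relations):
    """Flatten a sequence of discourse relation labels into one string."""
    return PLAN_SEP.join(relations)


def plan_as_input(document, relations, summary):
    """Strategy 1 (assumed format): the plan is part of the input."""
    source = f"{PLAN_TAG} {serialize_plan(relations)} {DOC_TAG} {document}"
    return Example(source=source, target=summary)


def plan_as_output_prefix(document, relations, summary):
    """Strategy 2 (assumed format): the model emits the plan, then the summary."""
    target = f"{PLAN_TAG} {serialize_plan(relations)} {SUMMARY_TAG} {summary}"
    return Example(source=document, target=target)


def strip_plan(generated):
    """Recover the summary from a plan-prefixed generation, if a plan is present."""
    _, sep, tail = generated.partition(SUMMARY_TAG)
    return tail.strip() if sep else generated.strip()


if __name__ == "__main__":
    # Illustrative labels only, not taken from the paper.
    plan = ["Background", "Explanation", "Result"]
    doc = "Full text of the scientific article."
    ref = "A lay summary with an explanatory sentence."
    print(plan_as_input(doc, plan, ref))
    print(plan_as_output_prefix(doc, plan, ref))
```

Under the first format, a user could edit the plan at inference time to steer how much explanatory content appears. Under the second, the model produces its own plan before writing, and `strip_plan` removes that prefix before the summary is shown to readers.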