🤖 AI Summary
This paper addresses the challenge of simultaneously preserving semantic fidelity and enhancing readability in long-document simplification. We propose the first stage-wise, collaborative, and progressive simplification framework tailored to large language models (LLMs). Methodologically, we introduce a novel three-level "document, topic, word" simplification paradigm and systematically design multi-stage prompt coordination, hierarchical task decomposition, consistency-constrained generation, and cross-granularity control mechanisms, explicitly distinguishing simplification from summarization. Compared with conventional word- or sentence-level approaches and single-prompt LLM methods, our framework achieves significant improvements in simplification quality across multiple benchmarks: it maintains global coherence and semantic faithfulness while enabling controlled complexity reduction at the structural, syntactic, and lexical levels, establishing new state-of-the-art performance.
📝 Abstract
Research on text simplification has primarily focused on lexical and sentence-level changes; long document-level simplification (DS) remains relatively unexplored. Large Language Models (LLMs) such as ChatGPT excel at many natural language processing tasks, yet their performance on DS is unsatisfactory: they often treat DS as mere document summarization. In the DS task, the generated long sequence must not only remain consistent with the original document throughout but also carry out moderate simplification operations at the discourse, sentence, and word levels. Human editors simplify documents using a hierarchical, complexity-based strategy. This study simulates that strategy through multi-stage collaboration with LLMs. We propose a progressive simplification method (ProgDS) that hierarchically decomposes the task into discourse-level, topic-level, and lexical-level simplification. Experimental results demonstrate that ProgDS significantly outperforms both existing smaller models and direct prompting of LLMs, advancing the state of the art in document simplification.
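The abstract's hierarchical decomposition (discourse → topic → lexical) can be pictured as a chained sequence of stage-specific prompts, each consuming the previous stage's output. The sketch below is an illustrative assumption, not the paper's actual implementation: the prompt texts and the pluggable `llm` callable are hypothetical placeholders for whatever model interface is used.

```python
"""Minimal sketch of a progressive, multi-stage simplification pipeline
in the spirit of ProgDS. Stage prompts and the `llm` callable are
illustrative assumptions, not the paper's published prompts."""

from typing import Callable

# Hypothetical per-stage instructions; real prompts would be far richer.
STAGE_PROMPTS = {
    "discourse": "Restructure the document's discourse for readability, keeping all key content.",
    "topic": "Within each topical section, simplify sentence structure without dropping information.",
    "lexical": "Replace complex or rare words with simpler synonyms, preserving meaning.",
}

STAGE_ORDER = ("discourse", "topic", "lexical")


def progressive_simplify(document: str, llm: Callable[[str, str], str]) -> str:
    """Apply discourse-, topic-, and lexical-level simplification in order.

    `llm(prompt, text)` is any callable that returns the model's rewrite
    of `text` under `prompt`; each stage feeds the next (collaboration
    via chaining rather than a single monolithic prompt).
    """
    text = document
    for stage in STAGE_ORDER:
        text = llm(STAGE_PROMPTS[stage], text)
    return text
```

The key design point the sketch captures is that each level operates on the output of the coarser level above it, so lexical edits never have to repair structural problems, mirroring how a human editor works top-down.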