🤖 AI Summary
To address incomplete utterance rewriting (IUR) in dialogues—caused by anaphora and ellipsis—existing methods predominantly rely on context-aware direct generation, overlooking the intrinsic structural decomposition into coreference resolution and ellipsis recovery. This paper proposes an edit-operation-driven two-stage IUR paradigm: Stage 1 predicts fine-grained edit actions (e.g., replace, insert, restore), and Stage 2 performs action- and context-conditioned rewriting. We introduce adversarial perturbation training to mitigate exposure bias and error propagation, and design a context-aware edit representation mechanism. Evaluated on three standard IUR benchmarks, our approach achieves significant improvements over state-of-the-art methods in both rewriting accuracy and BLEU score, demonstrating the effectiveness and robustness of the edit-based modeling paradigm.
📝 Abstract
Previous work on Incomplete Utterance Rewriting (IUR) has primarily focused on generating rewritten utterances based solely on dialogue context, ignoring the widespread phenomenon of coreference and ellipsis in dialogues. To address this issue, we propose a novel framework called TEO (Two-stage approach on Editing Operation) for IUR, in which the first stage generates editing operations and the second stage rewrites incomplete utterances using the generated editing operations and the dialogue context. Furthermore, an adversarial perturbation strategy is proposed to mitigate cascading errors and exposure bias caused by the inconsistency between training and inference in the second stage. Experimental results on three IUR datasets show that TEO significantly outperforms state-of-the-art (SOTA) models.
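To make the two-stage flow concrete, the following is a minimal illustrative sketch, not the TEO implementation. It assumes a hypothetical token-level `EditOp` format with `replace` and `insert` kinds, and it substitutes a deterministic `apply_edits` function for the second stage, which in the paper is a rewriting model conditioned on both the operations and the dialogue context. The example utterances are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EditOp:
    """Hypothetical edit operation (an assumption, not the paper's format)."""
    kind: str                # "replace" (coreference) or "insert" (ellipsis)
    position: int            # token index in the incomplete utterance
    tokens: Tuple[str, ...]  # tokens taken from the dialogue context


def apply_edits(utterance: List[str], ops: List[EditOp]) -> List[str]:
    """Deterministic stand-in for stage 2: apply the stage-1 operations.

    Operations are applied from right to left so that earlier token
    indices stay valid while the list grows.
    """
    result = list(utterance)
    for op in sorted(ops, key=lambda o: o.position, reverse=True):
        if op.kind == "replace":
            # Swap one token (e.g. a pronoun) for its referent.
            result[op.position:op.position + 1] = op.tokens
        elif op.kind == "insert":
            # Restore omitted tokens before the given position.
            result[op.position:op.position] = op.tokens
        else:
            raise ValueError(f"unknown edit kind: {op.kind}")
    return result


# Coreference: "her" refers to an entity mentioned earlier in the dialogue.
utterance = ["I", "love", "her", "songs"]
ops = [EditOp("replace", 2, ("Taylor", "Swift's"))]
print(" ".join(apply_edits(utterance, ops)))  # I love Taylor Swift's songs
```

Here the operations are written by hand. In a two-stage system of the kind the abstract describes, they would be predicted by the first stage, so any mistake in them carries into the rewrite. That is the cascading-error problem the adversarial perturbation strategy is meant to mitigate.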