AI Summary
This paper addresses the multi-label classification challenge in implicit discourse relation recognition (IDRR), which arises from semantic ambiguity. We introduce the first multi-label IDRR benchmark aligned with PDTB 3.0's three-layer semantic hierarchy, along with a unified single- and multi-label joint learning framework. Methodologically, we propose a context-encoder-based multi-task deep classifier that jointly optimizes a multi-label classification loss (e.g., binary cross-entropy) and a single-label cross-entropy loss, trained exclusively on the DiscoGeM corpus. Our contributions are threefold: (1) establishing the first multi-label IDRR benchmark; (2) proposing a joint single-/multi-label learning paradigm that naturally derives optimal single-label predictions from multi-label outputs; and (3) demonstrating, for the first time, effective cross-corpus transfer from DiscoGeM to PDTB 3.0. Experiments show our approach achieves state-of-the-art performance on the DiscoGeM single-label IDRR task.
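As a rough illustration of the joint objective described above, the following minimal sketch combines a binary cross-entropy term over all sense labels with a cross-entropy term for the single gold label, and derives a single-label prediction by taking the highest-scoring multi-label output. The function names, the `weight` parameter, the equal default weighting, and the argmax rule are assumptions made for illustration; the summary does not specify how the two losses are combined or how the single label is selected.

```python
import math


def sigmoid(x):
    """Logistic function, used for independent per-label probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over labels (multi-label term).

    Targets may be hard (0/1) or soft values in [0, 1].
    """
    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        loss -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return loss / len(logits)


def cross_entropy(logits, gold_index):
    """Cross-entropy for one gold class (single-label term)."""
    return -math.log(softmax(logits)[gold_index])


def joint_loss(multi_logits, multi_targets, single_logits, gold_index, weight=0.5):
    """Weighted sum of the two terms; `weight` is a hypothetical hyperparameter."""
    bce = binary_cross_entropy(multi_logits, multi_targets)
    ce = cross_entropy(single_logits, gold_index)
    return weight * bce + (1.0 - weight) * ce


def single_label_from_multi(multi_logits):
    """Assumed derivation rule: pick the index of the highest multi-label score."""
    return max(range(len(multi_logits)), key=lambda i: multi_logits[i])


if __name__ == "__main__":
    # PDTB 3.0 top-level senses, used here only as example labels.
    senses = ["Temporal", "Contingency", "Comparison", "Expansion"]
    multi_logits = [-1.0, 2.0, 0.5, 1.5]   # made-up scores for one argument pair
    multi_targets = [0, 1, 0, 1]           # two senses annotated as valid
    single_logits = [-1.0, 2.0, 0.5, 1.5]  # made-up scores from a single-label head
    print(joint_loss(multi_logits, multi_targets, single_logits, gold_index=1))
    print(senses[single_label_from_multi(multi_logits)])
```

In a real system both terms would be computed on batched tensors by a deep learning framework and backpropagated through the shared context encoder; the plain-Python version only makes the arithmetic of the combined objective explicit.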
Abstract
We address the inherent ambiguity in Implicit Discourse Relation Recognition (IDRR) by introducing a novel multi-task classification model capable of learning both multi-label and single-label representations of discourse relations. Our model is trained exclusively on the DiscoGeM corpus and evaluated on both the DiscoGeM and PDTB 3.0 corpora. We establish the first benchmark on multi-label IDRR classification and achieve state-of-the-art (SOTA) results on single-label IDRR classification using the DiscoGeM corpus. Finally, we present the first evaluation of the potential for transfer learning between the DiscoGeM and PDTB 3.0 corpora on single-label IDRR classification.