Synthetic Data Augmentation for Cross-domain Implicit Discourse Relation Recognition

📅 2025-03-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the performance degradation of supervised implicit discourse relation recognition (IDRR) models in cross-domain settings and presents a systematic investigation into whether large language model (LLM)-generated synthetic data can aid domain adaptation. The proposed method employs LLM-based conditional continuation to synthesize discourse argument pairs exhibiting target implicit coherence relations directly from unlabelled target-domain texts, with the aim of improving the cross-domain generalization of a model trained on labelled source-domain data. Extensive experiments on a large-scale cross-domain benchmark show that none of the evaluated synthetic-data strategies yields a statistically significant improvement, indicating that current LLM-generated samples do not adequately support IDRR domain adaptation. Key contributions include: (1) establishing statistical-significance and comparability standards for cross-domain IDRR evaluation; and (2) providing an empirical analysis of the limitations of LLM-synthesized data for IDRR, thereby setting a baseline and a cautionary insight for future research.

📝 Abstract
Implicit discourse relation recognition (IDRR) -- the task of identifying the implicit coherence relation between two text spans -- requires deep semantic understanding. Recent studies have shown that zero- or few-shot approaches significantly lag behind supervised models, but LLMs may be useful for synthetic data augmentation, where LLMs generate a second argument following a specified coherence relation. We applied this approach in a cross-domain setting, generating discourse continuations using unlabelled target-domain data to adapt a base model which was trained on source-domain labelled data. Evaluations conducted on a large-scale test set revealed that different variations of the approach did not result in any significant improvements. We conclude that LLMs often fail to generate useful samples for IDRR, and emphasize the importance of considering both statistical significance and comparability when evaluating IDRR models.
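The augmentation step described above — prompting an LLM to write a second argument (Arg2) that continues a target-domain first argument (Arg1) under a specified coherence relation, without an explicit connective — can be sketched as a prompt builder. This is a minimal illustration, not the authors' actual template: the relation names follow the PDTB top-level senses, and the `RELATION_HINTS` wording and function name are assumptions for demonstration.

```python
# Hypothetical sketch of the conditional-continuation prompting step:
# given a target-domain Arg1 and a desired implicit coherence relation,
# build a prompt asking an LLM to generate Arg2. The relation-to-hint
# mapping and prompt wording are illustrative placeholders.

RELATION_HINTS = {
    "Comparison": "contrasts with or concedes a point from",
    "Contingency": "gives a cause, result, or condition for",
    "Expansion": "elaborates on or restates",
    "Temporal": "describes an event before, after, or during",
}

def build_continuation_prompt(arg1: str, relation: str) -> str:
    """Build a generation prompt for a synthetic second argument (Arg2)."""
    hint = RELATION_HINTS[relation]
    return (
        f"Continue the following text with one sentence that {hint} it, "
        f"without using an explicit connective such as 'but' or 'because'.\n"
        f"Text: {arg1}\n"
        f"Continuation:"
    )

prompt = build_continuation_prompt(
    "The company reported record profits this quarter.", "Contingency"
)
print(prompt)
```

The generated (Arg1, Arg2, relation) triples would then be added to the source-domain training data to adapt the base classifier — the setting in which the paper finds no significant gains.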
Problem

Research questions and friction points this paper is trying to address.

Improving cross-domain implicit discourse relation recognition
Exploring synthetic data augmentation using LLMs
Evaluating the effectiveness of LLM-generated samples for IDRR
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-generated synthetic data augmentation
Cross-domain adaptation with unlabelled data
Evaluating statistical significance in IDRR