DeDisCo at the DISRPT 2025 Shared Task: A System for Discourse Relation Classification

📅 2025-09-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses cross-lingual discourse relation classification in the DISRPT 2025 shared task, where limited annotated data and language-specific annotation biases hinder robust generalization. Method: We propose a dual-path encoder–decoder architecture: mT5 serves as the multilingual encoder to extract semantic representations, while Qwen acts as the reasoning-aware decoder for relation inference. We integrate machine-translated augmented data with explicit linguistic features—including dependency paths and discourse connectives—and introduce a progressive fine-tuning strategy tailored for low-resource settings. Contribution/Results: The model achieves 71.28% macro-accuracy on the DISRPT 2025 test set—ranking among the top systems—demonstrating strong generalization under unsupervised and few-shot conditions. Error analysis confirms its effectiveness in mitigating cross-lingual annotation bias and improving long-distance dependency modeling, establishing a scalable paradigm for low-resource discourse parsing.
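The reasoning-aware decoder path described above could be driven by a classification prompt along these lines. This is a minimal sketch under assumptions: the function name, prompt template, and relation label set are illustrative, not taken from the paper.

```python
# Illustrative sketch of a prompt for decoder-based discourse relation
# classification. The template and the label subset below are hypothetical.

CANDIDATE_RELATIONS = ["cause", "contrast", "elaboration", "condition"]  # illustrative subset

def build_relation_prompt(unit1: str, unit2: str, connective=None) -> str:
    """Compose a classification prompt pairing two discourse units."""
    lines = [
        "Classify the discourse relation between the two units.",
        f"Unit 1: {unit1}",
        f"Unit 2: {unit2}",
    ]
    if connective is not None:
        # Explicit discourse connectives are among the linguistic features
        # the summary mentions; here they are surfaced in the prompt.
        lines.append(f"Connective: {connective}")
    lines.append("Candidate relations: " + ", ".join(CANDIDATE_RELATIONS))
    lines.append("Answer with one relation label.")
    return "\n".join(lines)

prompt = build_relation_prompt(
    "It was raining heavily,", "so the match was postponed.", connective="so"
)
print(prompt)
```

The prompt string would then be passed to the decoder model (e.g. Qwen) for generation; how the system actually formats its inputs is not specified in this summary.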

📝 Abstract
This paper presents DeDisCo, Georgetown University's entry in the DISRPT 2025 shared task on discourse relation classification. We test two approaches: an mT5-based encoder and a decoder-based approach using the openly available Qwen model. We also experiment with training on datasets augmented for low-resource languages using matched data translated automatically from English, as well as with additional linguistic features inspired by entries in previous editions of the Shared Task. Our system achieves a macro-accuracy score of 71.28, and we provide interpretation and error analysis for our results.
Problem

Research questions and friction points this paper is trying to address.

Classifying discourse relations in multilingual text
Improving performance for low-resource language processing
Testing transformer-based approaches for relation classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

mT5-based encoder for classification
Qwen decoder-based approach for relation inference
Augmented training data via automatic translation from English
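One plausible way to feed explicit linguistic features such as dependency paths and discourse connectives to an encoder is to serialize them into the input string. The sketch below is an assumption for illustration only; the marker tags and format are hypothetical, not the paper's actual input scheme.

```python
# Hypothetical sketch: flattening a feature-augmented example (text pair,
# dependency path, connective) into a single encoder input string.
# The marker tags and the "->" path separator are assumptions.

def serialize_example(unit1, unit2, dep_path, connective):
    """Flatten a feature-augmented example into one encoder input string."""
    return " ".join([
        "<unit1>", unit1,
        "<unit2>", unit2,
        "<deppath>", "->".join(dep_path),   # dependency path between units
        "<conn>", connective,               # explicit discourse connective
    ])

example = serialize_example(
    "She missed the bus.", "Therefore she was late.",
    dep_path=["advmod", "root"], connective="therefore",
)
print(example)
```

Such marker tokens would typically be added to the tokenizer's vocabulary before fine-tuning, so they are not split into subwords.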
Zhuoxuan Ju
Georgetown University
Jingni Wu
Georgetown University
Abhishek Purushothama
Georgetown University
Amir Zeldes
Associate Professor of Computational Linguistics, Georgetown University
corpus linguistics, computational linguistics, NLP, discourse, digital humanities