🤖 AI Summary
This work addresses discourse relation classification in the DISRPT 2025 shared task, where limited annotated data for low-resource languages hinders robust generalization. Method: Two approaches are tested: an mT5-based encoder approach and a decoder-based approach using the openly available Qwen model. Training data for low-resource languages is augmented with matched data translated automatically from English, supplemented by additional linguistic features inspired by entries in previous editions of the shared task. Contribution/Results: The system achieves a macro-accuracy score of 71.28 on the DISRPT 2025 test set, and the paper provides interpretation and error analysis of the results.
📝 Abstract
This paper presents DeDisCo, Georgetown University's entry in the DISRPT 2025 shared task on discourse relation classification. We test two approaches: an mT5-based encoder approach and a decoder-based approach using the openly available Qwen model. We also experiment with training on augmented datasets for low-resource languages, using matched data translated automatically from English, as well as with some additional linguistic features inspired by entries in previous editions of the shared task. Our system achieves a macro-accuracy score of 71.28, and we provide some interpretation and error analysis of our results.