Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark

📅 2025-01-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Relation extraction models often suffer from shortcut learning: spurious correlations between entity mentions and relation labels cause severe generalization degradation, especially on data exhibiting entity bias. To address this, we propose DREB, a controllable debiasing evaluation benchmark for entity bias that breaks these spurious correlations through controlled entity replacement. We further introduce MixDebias, a hybrid debiasing method that combines data reweighting with adversarial training to decouple bias at both the data and the training level. DREB supports quantitative bias measurement (Bias Evaluator) and naturalness assessment of generated instances (PPL Evaluator). Experiments show that MixDebias substantially improves out-of-distribution generalization on DREB while preserving in-distribution accuracy on the original datasets. The code and benchmark will be publicly released.

📝 Abstract
Benchmarks are crucial for evaluating machine learning algorithms, enabling comparison and identifying superior solutions. However, biases within datasets can lead models to learn shortcut patterns, resulting in inaccurate assessments and hindering real-world applicability. This paper addresses entity bias in relation extraction, where models tend to rely on entity mentions rather than context. We propose DREB, a debiased relation extraction benchmark that breaks the spurious correlation between entity mentions and relation types through entity replacement. DREB uses a Bias Evaluator and a PPL Evaluator to ensure low bias and high naturalness, providing a reliable assessment of model generalization under entity bias. To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model-training-level techniques. MixDebias effectively improves model performance on DREB while maintaining performance on the original dataset. Extensive experiments demonstrate the effectiveness and robustness of MixDebias compared to existing methods, highlighting its potential for improving the generalization ability of relation extraction models. We will release DREB and MixDebias publicly.
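The entity-replacement idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the instance schema, entity pool, and function names are all hypothetical, and a real pipeline would also filter candidates with the paper's Bias and PPL Evaluators.

```python
import random

# Hypothetical pool of same-type substitute entities. In DREB-style
# debiasing, swapping mentions with same-type entities breaks the
# spurious correlation between entity surface forms and relation labels.
ENTITY_POOL = {
    "PERSON": ["Alice Zhang", "Tomas Eriksen", "Mina Okafor"],
    "ORG": ["Acme Labs", "Northwind Group", "Helios Institute"],
}

def replace_entities(instance, pool=ENTITY_POOL, seed=0):
    """Return a copy of a relation-extraction instance with the head and
    tail mentions replaced by same-type substitutes; the relation label
    is kept, so it can no longer be predicted from the entities alone."""
    rng = random.Random(seed)
    text = instance["text"]
    new_instance = dict(instance)
    for slot in ("head", "tail"):
        mention, etype = instance[slot]
        candidates = [c for c in pool.get(etype, []) if c != mention]
        if not candidates:
            continue  # no same-type substitute available; leave as-is
        substitute = rng.choice(candidates)
        text = text.replace(mention, substitute)
        new_instance[slot] = (substitute, etype)
    new_instance["text"] = text
    return new_instance

instance = {
    "text": "Steve Jobs founded Apple in 1976.",
    "head": ("Steve Jobs", "PERSON"),
    "tail": ("Apple", "ORG"),
    "relation": "founded_by",
}
debiased = replace_entities(instance)
```

The key design point is that only the mentions change: context and label stay fixed, so a model that truly reads context should keep its prediction on the debiased instance.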
Problem

Research questions and friction points this paper is trying to address.

Machine Learning Bias
Entity Bias
Contextual Information
Innovation

Methods, ideas, or system contributions that make the work stand out.

DREB
MixDebias
Bias Mitigation
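For the data-level half of a MixDebias-style pipeline, one common reweighting scheme is to down-weight training examples that an entity-only "bias model" already classifies confidently, forcing the main model to rely on context. This sketch is an assumption about how such reweighting could look, not the paper's actual formula.

```python
def reweight(bias_probs):
    """Given p_bias(gold relation | entities only) per example, assign
    weight_i = 1 - p_i: examples the bias model gets right from entities
    alone contribute less to the main model's training loss."""
    return [1.0 - p for p in bias_probs]

# Example: the first instance is easy for the entity-only model (0.95),
# so it receives a near-zero training weight.
bias_probs = [0.95, 0.40, 0.10]
weights = reweight(bias_probs)
```

The adversarial-training half (the model-level component) would sit on top of these weights during optimization and is omitted here.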
Liang He
National Key Laboratory for Novel Software Technology, Nanjing University
Yougang Chu
National Key Laboratory for Novel Software Technology, Nanjing University
Zhen Wu
National Key Laboratory for Novel Software Technology, Nanjing University
Jianbing Zhang
Associate Professor, Nanjing University
pre-training model · multi-modal · image captioning · natural language processing · data mining
Xinyu Dai
Nanjing University
Jiajun Chen
National Key Laboratory for Novel Software Technology, Nanjing University