🤖 AI Summary
In relation extraction (RE), smaller encoder-based models still outperform decoder-only LLMs, yet their predicate representations are constrained by a fixed linear mapping. This paper proposes a dual-encoder architecture: a primary encoder processes the sentence, while an auxiliary encoder dynamically generates instance-adaptive predicate representations fused with the entity spans of each input, achieving entity-aware predicate instantiation. The method jointly optimizes a contrastive loss and a cross-entropy loss to enhance inter-class discriminability. Lightweight fine-tuning of BiomedBERT/RoBERTa yields consistent F1 improvements of 1-2% over state-of-the-art methods across four standard benchmarks, including two biomedical datasets. Key contributions: (1) an instance-adaptive predicate encoding mechanism; (2) a cooperative dual-encoder paradigm; and (3) an efficient, transferable lightweight fine-tuning strategy.
📝 Abstract
Relation extraction (RE) is a standard information extraction task that plays a major role in downstream applications such as knowledge discovery and question answering. Although decoder-only large language models excel at generative tasks, smaller encoder models remain the go-to architecture for RE. In this paper, we revisit fine-tuning such smaller models using a novel dual-encoder architecture with a joint contrastive and cross-entropy loss. Unlike previous methods that employ a fixed linear layer for predicate representations, our approach uses a second encoder to compute instance-specific predicate representations by infusing them with the actual entity spans from the corresponding input instances. We conducted experiments on two biomedical RE datasets and two general-domain datasets. Our approach achieved F1 score improvements of 1% to 2% over state-of-the-art methods with a simple yet elegant formulation. Ablation studies confirm the importance of the various components of the proposed architecture.
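The joint contrastive and cross-entropy objective described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' implementation: the toy embeddings, the InfoNCE-style form of the contrastive term, and the `alpha`/`tau` hyperparameters are all assumptions for the sake of the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def joint_loss(sentence_emb, predicate_embs, gold, alpha=0.5, tau=0.1):
    """Toy joint objective: cross-entropy over sentence-predicate logits,
    plus an InfoNCE-style contrastive term (hypothetical formulation) that
    pulls the sentence embedding toward the gold predicate representation
    and away from the others."""
    logits = [dot(sentence_emb, p) for p in predicate_embs]
    ce = -math.log(softmax(logits)[gold])
    # Contrastive term: same similarities, sharpened by temperature tau.
    contrastive = -math.log(softmax([l / tau for l in logits])[gold])
    return ce + alpha * contrastive

# Toy instance-specific predicate representations, standing in for the
# auxiliary encoder's output after entity-span fusion (values made up).
preds = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
sent = [0.8, 0.2]  # stand-in for the primary encoder's sentence embedding
loss = joint_loss(sent, preds, gold=0)
```

In the paper's setup the predicate representations would come from the second encoder and vary per instance; here they are fixed toy vectors so the loss arithmetic is easy to follow.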