TriG-NER: Triplet-Grid Framework for Discontinuous Named Entity Recognition

📅 2024-11-04
🏛️ arXiv.org
🤖 AI Summary
Discontinuous named entity recognition (DNER) is challenging because a single entity may span non-contiguous text segments. Method: This paper proposes a word-pair relational modeling paradigm that does not depend on a particular tagging scheme: it builds a triplet grid that encodes token-level similarity within and between entities, and introduces a token-level triplet loss for end-to-end optimization. The approach decouples entity boundary detection from rigid labeling-format constraints. Contribution/Results: Evaluated on three benchmark DNER datasets, the method improves significantly over existing grid-based architectures at capturing complex entity structures, and it adapts to different tagging schemes, which indicates good generalizability.
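
To make the word-pair view concrete, here is a minimal sketch of a same-entity grid for one sentence. It is an illustration only, not the paper's exact grid: the function name `same_entity_grid`, the 0/1 cell encoding, and the example sentence are all assumptions introduced here.

```python
from typing import List


def same_entity_grid(num_tokens: int, entities: List[List[int]]) -> List[List[int]]:
    """Return an n x n grid where cell (i, j) is 1 if tokens i and j share an entity.

    `entities` lists each entity as the token indices it covers; the indices
    need not be adjacent, which is what makes an entity discontinuous.
    (Hypothetical helper, not taken from the paper.)
    """
    grid = [[0] * num_tokens for _ in range(num_tokens)]
    for token_ids in entities:
        for i in token_ids:
            for j in token_ids:
                grid[i][j] = 1
    return grid


# "muscle pain and fatigue" contains two entities that share the token "muscle":
# "muscle pain" (tokens 0, 1) and the discontinuous "muscle ... fatigue" (tokens 0, 3).
tokens = ["muscle", "pain", "and", "fatigue"]
entities = [[0, 1], [0, 3]]
grid = same_entity_grid(len(tokens), entities)

for token, row in zip(tokens, grid):
    print(f"{token:>8} {row}")
```

Because the grid records only which word pairs belong together, the same structure can be filled from any annotation format that lists the tokens of each entity.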

📝 Abstract
Discontinuous Named Entity Recognition (DNER) presents a challenging problem where entities may be scattered across multiple non-adjacent tokens, making traditional sequence labelling approaches inadequate. Existing methods predominantly rely on custom tagging schemes to handle these discontinuous entities, resulting in models tightly coupled to specific tagging strategies and lacking generalisability across diverse datasets. To address these challenges, we propose TriG-NER, a novel Triplet-Grid Framework that introduces a generalisable approach to learning robust token-level representations for discontinuous entity extraction. Our framework applies triplet loss at the token level, where similarity is defined by word pairs existing within the same entity, effectively pulling together similar and pushing apart dissimilar ones. This approach enhances entity boundary detection and reduces the dependency on specific tagging schemes by focusing on word-pair relationships within a flexible grid structure. We evaluate TriG-NER on three benchmark DNER datasets and demonstrate significant improvements over existing grid-based architectures. These results underscore our framework's effectiveness in capturing complex entity structures and its adaptability to various tagging schemes, setting a new benchmark for discontinuous entity extraction.
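
The token-level triplet objective described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses the standard hinge form of triplet loss with Euclidean distance, treats every token outside the current entity as a negative, and averages over all triplets. The names `triplet_loss`, `token_triplet_loss`, and the `margin` default are hypothetical.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two token embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin: float = 1.0) -> float:
    """Standard hinge triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def token_triplet_loss(
    embeddings: List[Sequence[float]],
    entities: List[List[int]],
    margin: float = 1.0,
) -> float:
    """Average triplet loss over token triplets drawn from each entity.

    Assumption for this sketch: for an anchor token in an entity, positives
    are the other tokens of that entity and negatives are all tokens outside it.
    """
    losses = []
    num_tokens = len(embeddings)
    for token_ids in entities:
        members = set(token_ids)
        outsiders = [k for k in range(num_tokens) if k not in members]
        for a in token_ids:
            for p in token_ids:
                if p == a:
                    continue
                for n in outsiders:
                    losses.append(
                        triplet_loss(embeddings[a], embeddings[p], embeddings[n], margin)
                    )
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2-d embeddings for "muscle pain and fatigue"; entities as token indices.
toy_embeddings = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [1.0, 0.0]]
toy_entities = [[0, 1], [0, 3]]
print(token_triplet_loss(toy_embeddings, toy_entities))
```

Minimising this quantity pulls tokens of the same entity together and pushes other tokens at least `margin` further away, which is the behaviour the abstract describes. A real model would compute it on learned encoder outputs and would likely sample triplets instead of enumerating all of them.
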
Problem

Research questions and friction points this paper is trying to address.

Discontinuous Named Entity Recognition
DNER
Challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

TriG-NER
Triplet Loss
Discontinuous Entity Recognition
R. Cabral
The University of Sydney, Sydney, NSW, Australia
S. Han
The University of Melbourne, Melbourne, VIC, Australia
Areej Alhassan
The University of Manchester, Manchester, England, UK
R. Batista-Navarro
The University of Manchester, Manchester, England, UK
Goran Nenadic
Department of Computer Science, University of Manchester
Natural language processing, text mining, health informatics
Josiah Poon
University of Sydney
Machine learning, natural language processing, text mining, health informatics