🤖 AI Summary
To address the high computational cost and negative sample explosion in span-level metric learning for few-shot named entity recognition (NER), this paper proposes a two-stage decoupled framework: first detecting candidate entity spans, then classifying their types. Methodologically, we design a hybrid multi-stage decoder, incorporate an entity-oriented contrastive learning module to enhance semantic discriminability, and jointly leverage KNN retrieval and classification for robust inference. The entire framework is trained end-to-end via meta-learning and fine-tuned lightly on support sets. Compared with prevailing token-level or span-level approaches, our method significantly alleviates negative sample redundancy and computational overhead. On the FewNERD benchmark, it achieves state-of-the-art performance, particularly demonstrating superior generalization to unseen entity types.
📝 Abstract
Few-shot named entity recognition (NER) aims to identify new entity types from only a few labeled examples. Previous methods employing token-level or span-level metric learning suffer from heavy computational cost and a large number of negative sample spans. In this paper, we propose Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER), which splits general NER into two stages: entity-span detection and entity classification. MsFNER involves three processes: training, finetuning, and inference. In the training process, we separately train the entity-span detection model and the entity classification model on the source domain via meta-learning, and we introduce a contrastive learning module to enhance entity representations for entity classification. During finetuning, we finetune both models on the support dataset of the target domain. In the inference process, given unlabeled data, we first detect entity spans; their types are then jointly determined by the entity classification model and a KNN classifier. We conduct experiments on the open FewNERD dataset, and the results demonstrate the effectiveness of MsFNER.
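The joint inference step (classifier plus KNN over support-set entity embeddings) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the interpolation weight `lam`, the distance-weighted neighbor voting, and all function names are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def knn_distribution(span_emb, support_embs, support_labels, n_types, k=3):
    """Type distribution from the k nearest support entity embeddings,
    each neighbor weighted by a softmax over negative distances."""
    dists = np.linalg.norm(support_embs - span_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = softmax(-dists[nearest])
    p = np.zeros(n_types)
    for w, idx in zip(weights, nearest):
        p[support_labels[idx]] += w
    return p

def joint_predict(cls_logits, span_emb, support_embs, support_labels, lam=0.5):
    """Interpolate the classifier's distribution with the KNN distribution
    and return the predicted entity type index."""
    p_cls = softmax(cls_logits)
    p_knn = knn_distribution(span_emb, support_embs, support_labels,
                             n_types=len(cls_logits))
    return int(np.argmax(lam * p_cls + (1 - lam) * p_knn))
```

In this sketch, a detected span's embedding is compared against the finetuned support-set entity embeddings, so the KNN term lets rare or unseen target-domain types influence the final decision even when the classifier is uncertain.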