Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models

📅 2025-06-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the poor cross-domain generalization and high adaptation cost of large language models (LLMs) in named entity recognition (NER), this paper proposes SaM, a training-free framework that adapts dynamically at inference time. SaM selects the most suitable domain-specific expert models in real time, based on domain similarity and instance-level performance estimation, and constructs a task-specific model via lightweight parameter fusion. Its core innovation is claimed to be the first zero-training, plug-and-play dynamic model-fusion mechanism in which expert models can be flexibly added or removed. Evaluated on multiple cross-domain NER benchmarks, SaM achieves an average 10% improvement over unified LLM baselines, demonstrating substantial gains in both generalization and deployment efficiency.

📝 Abstract
Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks, such as named entity recognition (NER). However, annotating such fine-grained labels and training domain-specific models is costly. Existing works typically train a unified model across multiple domains, but such approaches lack adaptation and scalability since not all training data benefits target domains and scaling trained models remains challenging. We propose the SaM framework, which dynamically Selects and Merges expert models at inference time. Specifically, for a target domain, we select domain-specific experts pre-trained on existing domains based on (i) domain similarity to the target domain and (ii) performance on sampled instances, respectively. The experts are then merged to create task-specific models optimized for the target domain. By dynamically merging experts beneficial to target domains, we improve generalization across various domains without extra training. Additionally, experts can be added or removed conveniently, leading to great scalability. Extensive experiments on multiple benchmarks demonstrate our framework's effectiveness, which outperforms the unified model by an average of 10%. We further provide insights into potential improvements, practical experience, and extensions of our framework.
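The abstract's two-stage recipe (score pre-trained experts by domain similarity and by performance on sampled target instances, then merge the selected ones into a task-specific model) can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: the equal 50/50 scoring weights, the top-k cutoff, and the score-weighted parameter average are all assumptions, and real experts would be LLM checkpoints rather than toy parameter dicts.

```python
# Hypothetical sketch of a select-then-merge scheme in the spirit of SaM.
# Each expert is represented here as a flat {param_name: value} dict.

def select_experts(experts, similarity, sample_acc, k=2):
    """Rank experts by a combined score and keep the top-k.

    similarity: assumed domain-similarity signal per expert (0..1).
    sample_acc: assumed accuracy on a few sampled target instances (0..1).
    The 0.5/0.5 weighting is an illustrative choice, not from the paper.
    """
    scores = {name: 0.5 * similarity[name] + 0.5 * sample_acc[name]
              for name in experts}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {name: scores[name] for name in ranked[:k]}

def merge_experts(experts, selected):
    """Score-weighted average of the selected experts' parameters."""
    total = sum(selected.values())
    merged = {}
    for name, weight in selected.items():
        for param, value in experts[name].items():
            merged[param] = merged.get(param, 0.0) + (weight / total) * value
    return merged

# Toy experts pre-trained on three source domains.
experts = {
    "news":   {"w1": 1.0, "w2": 0.0},
    "biomed": {"w1": 0.0, "w2": 1.0},
    "social": {"w1": 0.5, "w2": 0.5},
}
similarity = {"news": 0.9, "biomed": 0.2, "social": 0.6}
sample_acc = {"news": 0.8, "biomed": 0.3, "social": 0.7}

selected = select_experts(experts, similarity, sample_acc, k=2)
merged = merge_experts(experts, selected)
```

Because merging is a simple weighted average over parameters, adding or removing an expert only changes the candidate pool passed to `select_experts`, which is what makes the framework easy to scale.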
Problem

Research questions and friction points this paper is trying to address.

Costly annotation and training for domain-specific NER models
Lack of adaptation and scalability in unified multi-domain NER models
How to dynamically select and merge expert models to improve NER generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic expert model selection based on domain similarity
Merging experts for task-specific model optimization
Scalable framework allowing easy expert addition/removal
Zhuojun Ding
School of Computer Science & Technology, Huazhong University of Science and Technology
Wei Wei
School of Computer Science & Technology, Huazhong University of Science and Technology
Chenghao Fan
School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, China
Natural Language Processing · LLM