🤖 AI Summary
Generative large language models (LLMs) generalize poorly to zero-shot and low-resource named entity recognition (NER) and are hard to interpret, because their reasoning is implicit. To address this, the paper proposes what it describes as the first end-to-end explicit reasoning framework tailored for NER. The method replaces conventional pattern matching with a verifiable chain-of-thought (CoT) reasoning process, combining instruction tuning with a reasoning optimization mechanism driven by a multi-dimensional reward. The framework is trained in three stages to make the model's reasoning more transparent and its predictions more robust. Experiments report an F1 score of 72.4% in zero-shot settings, outperforming GPT-4 by 12.3 percentage points and setting a new state of the art. The core contribution is a verifiable and optimizable explicit reasoning paradigm for NER, grounded in interpretable, stepwise inference.
📝 Abstract
Generative LLMs typically improve Named Entity Recognition (NER) performance through instruction tuning. They excel at generating entities by semantic pattern matching but lack an explicit, verifiable reasoning mechanism. This "cognitive shortcutting" leads to suboptimal performance and brittle generalization, especially in zero-shot and low-resource scenarios where reasoning from limited contextual cues is crucial. To address this issue, we propose ReasoningNER, a reasoning framework for NER that shifts the extraction paradigm from implicit pattern matching to explicit reasoning. The framework consists of three stages: Chain of Thought (CoT) generation, CoT tuning, and reasoning enhancement. First, a dataset annotated with NER-oriented CoTs is generated, containing task-relevant reasoning chains. Then, these CoT annotations are used to tune the NER model so that it generates a coherent rationale before deriving the final answer. Finally, a reasoning enhancement stage optimizes the reasoning process using a comprehensive reward signal, which ensures explicit and verifiable extractions. Experiments show that ReasoningNER demonstrates strong reasoning ability on the NER task and achieves competitive performance. In zero-shot settings, it achieves state-of-the-art (SOTA) performance, outperforming GPT-4 by 12.3 percentage points in F1 score. Analytical results also demonstrate its potential to advance research in reasoning-oriented information extraction. Our code is available at https://github.com/HuiResearch/ReasoningIE.
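The abstract does not specify how the reward signal in the reasoning enhancement stage is composed. As a rough illustration only, a reward of this kind could combine a check that the output follows a rationale-then-answer format with an entity-level F1 score against gold annotations. Everything in the sketch below is an assumption made for illustration and is not taken from the paper or its repository: the `<think>`/`<answer>` tag format, the JSON answer schema, the function names `parse_entities`, `entity_f1`, and `reward`, and the `format_weight` parameter.

```python
import json
import re

# Hypothetical output format (an assumption, not the paper's): the rationale
# sits inside <think>...</think>, followed by a JSON list of
# {"text": ..., "type": ...} objects inside <answer>...</answer>.
OUTPUT_PATTERN = re.compile(
    r"^\s*<think>(?P<cot>.+?)</think>\s*<answer>(?P<answer>.+?)</answer>\s*$",
    re.DOTALL,
)


def parse_entities(output):
    """Return the predicted set of (text, type) pairs, or None if malformed."""
    match = OUTPUT_PATTERN.match(output)
    if match is None:
        return None
    try:
        items = json.loads(match.group("answer"))
    except json.JSONDecodeError:
        return None
    if not isinstance(items, list):
        return None
    entities = set()
    for item in items:
        if not isinstance(item, dict):
            return None
        text, label = item.get("text"), item.get("type")
        if not isinstance(text, str) or not isinstance(label, str):
            return None
        entities.add((text, label))
    return entities


def entity_f1(predicted, gold):
    """Entity-level F1 with exact match on both surface text and type."""
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def reward(output, gold, format_weight=0.2):
    """Illustrative composite reward: format bonus plus weighted entity F1."""
    predicted = parse_entities(output)
    if predicted is None:
        return 0.0  # malformed output earns no reward at all
    return format_weight + (1.0 - format_weight) * entity_f1(predicted, gold)


if __name__ == "__main__":
    gold = {("Marie Curie", "PER"), ("Paris", "LOC")}
    output = (
        "<think>Marie Curie is a person; Paris is a city.</think>"
        '<answer>[{"text": "Marie Curie", "type": "PER"}]</answer>'
    )
    print(reward(output, gold))
```

In this sketch a malformed output scores zero, so the model is only rewarded for extraction accuracy once its answer can be parsed and checked, which is one simple way to tie the reward to verifiable extractions.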