🤖 AI Summary
To address the inefficiency and fine-tuning dependency of entity mention detection with large language models (LLMs), this paper proposes ToMMeR, a lightweight probe model (<300K parameters) operating on early-layer hidden representations of LLMs. Methodologically, ToMMeR exploits the structured entity representations that emerge naturally in early Transformer layers, enabling zero-shot, high-recall identification of mention boundaries. Evaluated across 13 NER benchmarks, ToMMeR achieves 93% zero-shot recall, and LLM-as-judge evaluation shows precision above 90%, indicating that it rarely produces spurious predictions. When extended with a span classification head, it reaches 80-87% F1, approaching state-of-the-art NER performance. This work demonstrates that early LLM layers encode transferable mention detection capacity that can be recovered with minimal additional parameters, pointing toward an efficient new paradigm for entity recognition.
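To make the probe idea concrete, here is a minimal sketch (not the authors' code; all names and shapes are illustrative assumptions) of scoring every candidate span over early-layer hidden states with two small learned projections, in the spirit of a <300K-parameter mention probe:

```python
import numpy as np

# Hypothetical stand-ins: in practice H would be early-layer hidden states
# from a frozen LLM; here we use random vectors to keep the sketch runnable.
rng = np.random.default_rng(0)
d_model, d_probe, seq_len = 64, 16, 8
H = rng.standard_normal((seq_len, d_model))

# Tiny probe parameters: one projection for span starts, one for span ends.
W_start = rng.standard_normal((d_model, d_probe)) / np.sqrt(d_model)
W_end = rng.standard_normal((d_model, d_probe)) / np.sqrt(d_model)

def span_scores(H):
    """Score every candidate span (i, j) with i <= j as a mention."""
    S, E = H @ W_start, H @ W_end   # (seq_len, d_probe) each
    scores = S @ E.T                # scores[i, j] = start_i . end_j
    return np.triu(scores)          # zero out invalid spans (i > j)

def top_spans(H, k=3):
    """Return the k highest-scoring (start, end) span indices."""
    scores = span_scores(H)
    flat = np.argsort(scores, axis=None)[::-1][:k]
    return [tuple(np.unravel_index(i, scores.shape)) for i in flat]

print(top_spans(H))
```

In a real setup the probe would be trained on mention boundaries and thresholded for high recall, with a downstream classifier (or an LLM judge) filtering the candidates, as the summary describes.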
📝 Abstract
Identifying which text spans refer to entities -- mention detection -- is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) probing mention detection capabilities from early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93% recall zero-shot, with over 90% precision when using an LLM as a judge, showing that ToMMeR rarely produces spurious predictions despite its high recall. Cross-model analysis reveals that diverse architectures (14M-15B parameters) converge on similar mention boundaries (DICE >75%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves near-SOTA NER performance (80-87% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in early transformer layers and can be efficiently recovered with minimal parameters.