Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition.

📅 2025-03-07

🏛️ AMIA ... Annual Symposium proceedings. AMIA Symposium

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This paper addresses the underuse of domain-specific semantic knowledge in clinical named entity recognition (NER). It proposes a modeling approach that explicitly incorporates semantic type dependencies from the Unified Medical Language System (UMLS) to improve contextual understanding. Specifically, a single-pass matrix encoding represents multi-type semantic relations (e.g., “Disease–Symptom” or “Drug–Dosage”) between entity spans and context tokens as structured matrices, which are integrated into a BiLSTM-GCN-CRF architecture. The method is compatible with pre-trained embeddings (e.g., BERT, BioBERT, UMLSBert) without requiring additional fine-tuning. Evaluated on standard clinical benchmarks, including i2b2 and ShARe/CLEF, it achieves average F1-score gains of 1.8–3.2 percentage points in fine-grained NER. The results indicate that explicitly modeling domain-specific semantic dependencies can significantly improve clinical NER effectiveness in some cases.

📝 Abstract

Previous work on clinical relation extraction from free-text sentences leveraged information about semantic types from clinical knowledge bases as a part of entity representations. In this paper, we exploit additional evidence by also making use of domain-specific semantic type dependencies. We encode the relation between a span of tokens matching a Unified Medical Language System (UMLS) concept and other tokens in the sentence. We implement our method and compare against different named entity recognition (NER) architectures (i.e., BiLSTM-CRF and BiLSTM-GCN-CRF) using different pre-trained clinical embeddings (i.e., BERT, BioBERT, UMLSBert). Our experimental results on clinical datasets show that in some cases NER effectiveness can be significantly improved by making use of domain-specific semantic type dependencies. Our work is also the first study generating a matrix encoding to make use of more than three dependencies in one pass for the NER task.

Problem

Research questions and friction points this paper is trying to address.

Improving clinical named entity recognition using semantic type dependencies.

Encoding UMLS concept relations for better NER performance.

First study to generate a matrix encoding that uses more than three dependencies in one pass for NER.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses domain-specific semantic type dependencies

Encodes the relation between a token span matching a UMLS concept and the other tokens in the sentence

Generates a matrix encoding that covers more than three dependencies in one pass
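As a rough illustration of the encoding idea listed above, the following Python sketch builds a single token-by-token matrix whose cells hold an integer id for the dependency between the semantic types of two matched spans. It is only one possible reading of the abstract, not the authors' implementation: the relation labels, the `TYPE_DEPENDENCIES` table, the `Span` layout, and the function `build_dependency_matrix` are all hypothetical, and the sketch assumes that the "other tokens" of interest are themselves covered by matched concept spans.

```python
from typing import Dict, List, Tuple

# Hypothetical toy relation inventory. These labels and type pairs are
# illustrative only; they are not taken from the paper or from UMLS.
RELATION_IDS: Dict[str, int] = {"none": 0, "treats": 1, "manifestation_of": 2}
TYPE_DEPENDENCIES: Dict[Tuple[str, str], str] = {
    ("Drug", "Disease"): "treats",
    ("Symptom", "Disease"): "manifestation_of",
}

# A matched concept span: (start index, exclusive end index, semantic type).
Span = Tuple[int, int, str]


def build_dependency_matrix(n_tokens: int, spans: List[Span]) -> List[List[int]]:
    """Return an n_tokens x n_tokens matrix of relation ids.

    Cell [i][j] is non-zero when token i lies in a span whose semantic type
    has a listed dependency on the semantic type of the span containing
    token j. All relation types share one matrix, so several dependencies
    are encoded in a single pass over the span pairs.
    """
    matrix = [[RELATION_IDS["none"]] * n_tokens for _ in range(n_tokens)]
    for s_start, s_end, s_type in spans:
        for t_start, t_end, t_type in spans:
            relation = TYPE_DEPENDENCIES.get((s_type, t_type))
            if relation is None:
                continue  # no listed dependency between these two types
            rel_id = RELATION_IDS[relation]
            for i in range(s_start, s_end):
                for j in range(t_start, t_end):
                    matrix[i][j] = rel_id
    return matrix


# Toy sentence with hand-assigned (assumed) concept spans.
tokens = ["aspirin", "relieved", "chest", "pain", "from", "angina"]
spans = [(0, 1, "Drug"), (2, 4, "Symptom"), (5, 6, "Disease")]
matrix = build_dependency_matrix(len(tokens), spans)
```

In a graph-based tagger such as a BiLSTM-GCN-CRF, a matrix of this kind could plausibly serve as a typed adjacency input alongside the token embeddings, although how the paper actually feeds the encoding into the model is not stated in the text above.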

🔎 Similar Papers

Entity Decomposition with Filtering: A Zero-Shot Clinical Named Entity Recognition Framework