Linguistic Blind Spots in Clinical Decision Extraction

📅 2026-02-03

🤖 AI Summary

This study examines why clinical decision extraction degrades on narrative-rich advice and precaution decisions, exposing a systematic blind spot. Using MedDec discharge summaries annotated with DICTUM decision categories, the work links decision categories to linguistic features such as stopword proportion, negation, and hedging cues, and analyzes the span-level recall of a standard Transformer extraction model under both exact and overlap matching. On the validation split, exact-match recall is only 48%, rising to 71% under overlap matching. Recall falls from 58% in the lowest stopword-proportion bin to 24% in the highest, and spans containing negation or hedging cues are less likely to be recovered, which shows how exact-match evaluation penalizes predictions that find the decision but disagree on its span boundaries.
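The difference between exact and overlap matching can be illustrated with a minimal sketch. The paper's precise overlap criterion is not stated here, so this example assumes that any character overlap between a gold span and a predicted span counts as a hit and ignores decision categories; the function names and the toy offsets are hypothetical.

```python
from typing import List, Tuple

# A span is a (start, end) pair of character offsets, end exclusive.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def span_recall(gold: List[Span], pred: List[Span], mode: str = "exact") -> float:
    """Fraction of gold spans recovered under the chosen match criterion.

    mode="exact":   a gold span counts only if an identical span was predicted.
    mode="overlap": a gold span counts if any prediction overlaps it
                    (an assumed, relaxed criterion for illustration).
    """
    if mode not in ("exact", "overlap"):
        raise ValueError(f"unknown mode: {mode}")
    if not gold:
        return 0.0
    pred_set = set(pred)
    hits = 0
    for g in gold:
        if mode == "exact":
            hits += g in pred_set
        else:
            hits += any(overlaps(g, p) for p in pred)
    return hits / len(gold)


# Toy data: two predictions find the right decision but miss its boundaries.
gold_spans = [(0, 10), (20, 35), (50, 60), (70, 90)]
pred_spans = [(0, 10), (22, 35), (48, 61)]

exact_recall = span_recall(gold_spans, pred_spans, "exact")      # 0.25
overlap_recall = span_recall(gold_spans, pred_spans, "overlap")  # 0.75
```

In this toy case the gap between the two scores comes entirely from boundary disagreements, which is the kind of error the summary attributes to narrative-style spans.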

📝 Abstract

Extracting medical decisions from clinical notes is a key step for clinical decision support and patient-facing care summaries. We study how the linguistic characteristics of clinical decisions vary across decision categories and whether these differences explain extraction failures. Using MedDec discharge summaries annotated with decision categories from the Decision Identification and Classification Taxonomy for Use in Medicine (DICTUM), we compute seven linguistic indices for each decision span and analyze span-level extraction recall of a standard transformer model. We find clear category-specific signatures: drug-related and problem-defining decisions are entity-dense and telegraphic, whereas advice and precaution decisions contain more narrative, with higher stopword and pronoun proportions and more frequent hedging and negation cues. On the validation split, exact-match recall is 48%, with large gaps across linguistic strata: recall drops from 58% in the lowest stopword-proportion bin to 24% in the highest, and spans containing hedging or negation cues are less likely to be recovered. Under a relaxed overlap-based match criterion, recall increases to 71%, indicating that many errors are span boundary disagreements rather than complete misses. Overall, narrative-style spans, which are common in advice and precaution decisions, are a consistent blind spot under exact matching, suggesting that downstream systems should incorporate boundary-tolerant evaluation and extraction strategies for clinical decisions.
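As a rough illustration of the stratified analysis described in the abstract, the sketch below computes a stopword proportion and cue flags for each span, then reports recall per stopword stratum. The abstract does not list the seven indices or the lexicons and bin edges used, so the word lists, the 0.3 threshold, and all function names here are hypothetical stand-ins.

```python
import re

# Tiny illustrative lexicons (assumptions, not the paper's resources).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "or", "if", "you", "your",
             "should", "be", "is", "in", "for", "with", "not", "may"}
HEDGE_CUES = {"may", "might", "could", "possibly", "likely", "consider"}
NEGATION_CUES = {"no", "not", "never", "without", "avoid"}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stopword_proportion(text):
    """Share of tokens in the span that are stopwords."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t in STOPWORDS for t in tokens) / len(tokens)


def has_cue(text, cues):
    """True if any token of the span belongs to the cue lexicon."""
    return any(t in cues for t in tokenize(text))


def recall_by_stopword_bin(spans, threshold=0.3):
    """Recall per stratum for (span_text, was_recovered) pairs.

    Spans at or above the assumed threshold fall in the "high" stratum.
    """
    counts = {"low": [0, 0], "high": [0, 0]}  # [hits, total]
    for text, recovered in spans:
        key = "high" if stopword_proportion(text) >= threshold else "low"
        counts[key][0] += int(recovered)
        counts[key][1] += 1
    return {k: (hits / n if n else None) for k, (hits, n) in counts.items()}


telegraphic = "Start metoprolol 25 mg BID"
narrative = "You should not drive if you feel dizzy"

toy_spans = [(telegraphic, True), (narrative, False), ("Hold aspirin", True)]
strata = recall_by_stopword_bin(toy_spans)
```

Here the telegraphic drug order has a stopword proportion of 0.0 while the narrative precaution reaches 0.625 and carries a negation cue, mirroring the contrast between entity-dense and narrative decision categories.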

Problem

Research questions and friction points this paper is trying to address.

clinical decision extraction

linguistic variation

narrative style

span boundary

recall gap

Innovation

Methods, ideas, or system contributions that make the work stand out.

clinical decision extraction

linguistic analysis

transformer model