Improving Drug Identification in Overdose Death Surveillance using Large Language Models

📅 2025-07-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Drug overdose deaths in the United States, driven largely by fentanyl, continue to rise, yet the specific substances involved are often recorded only in unstructured free-text death reports, and ICD-10 coding of those reports is delayed and loses information. To address this, the authors fine-tune a domain-adapted language model, BioClinicalBERT, for multi-label identification of the drugs involved in each death from that free text. Compared with conventional machine learning methods, general-domain BERT models, and decoder-only large language models, the fine-tuned model achieves a macro-F1 score of at least 0.998 on the internal test set and 0.966 on an external validation set drawn from later years (2023-2024). These results support near real-time surveillance of drug trends and more timely data for public health interventions.

📝 Abstract

The rising rate of drug-related deaths in the United States, largely driven by fentanyl, requires timely and accurate surveillance. However, critical overdose data are often buried in free-text coroner reports, leading to delays and information loss when coded into International Classification of Diseases, 10th Revision (ICD-10) codes. Natural language processing (NLP) models may automate and enhance overdose surveillance, but prior applications have been limited. A dataset of 35,433 death records from multiple U.S. jurisdictions in 2020 was used for model training and internal testing. External validation was conducted using a separate, novel dataset of 3,335 records from 2023-2024. Multiple NLP approaches were evaluated for classifying specific drug involvement from unstructured death certificate text. These included traditional single- and multi-label classifiers, fine-tuned encoder-only language models such as Bidirectional Encoder Representations from Transformers (BERT) and BioClinicalBERT, and contemporary decoder-only large language models such as Qwen 3 and Llama 3. Model performance was assessed using macro-averaged F1 scores, and 95% confidence intervals were calculated to quantify uncertainty. Fine-tuned BioClinicalBERT models achieved near-perfect performance, with macro F1 scores ≥ 0.998 on the internal test set. External validation confirmed robustness (macro F1 = 0.966), outperforming conventional machine learning, general-domain BERT models, and various decoder-only large language models. NLP models, particularly fine-tuned clinical variants like BioClinicalBERT, offer a highly accurate and scalable solution for overdose death classification from free-text reports. These methods can significantly accelerate surveillance workflows, overcoming the limitations of manual ICD-10 coding and supporting near real-time detection of emerging substance use trends.
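The evaluation metric named in the abstract, macro-averaged F1 with a 95% confidence interval, can be illustrated with a minimal sketch. The abstract does not say how the intervals were computed, so the percentile bootstrap below is an assumption, as are the function names `macro_f1` and `bootstrap_ci` and the toy two-label data.

```python
import random

def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over the label columns of two multi-hot matrices."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no positives and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels

def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for macro F1 (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample records with replacement, keeping truth and prediction paired.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper

# Toy data: four records, two hypothetical drug labels.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [1, 0], [0, 1], [0, 0]]
score = macro_f1(y_true, y_pred)
lower, upper = bootstrap_ci(y_true, y_pred)
```

Macro averaging gives every drug label equal weight regardless of how often it appears, so a rarely reported substance affects the score as much as fentanyl does.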

Problem

Research questions and friction points this paper is trying to address.

Automate drug identification in overdose death reports

Improve accuracy of overdose surveillance using NLP

Overcome limitations of manual ICD-10 coding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned BioClinicalBERT for overdose classification

External validation with 3,335 recent records

Outperforms traditional and general-domain models
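
Multi-label classification means one report can be assigned several drugs at once. A common way to decode such a classifier, sketched below, is to pass each label's logit through a sigmoid and keep the labels above a threshold. This is an illustration under assumptions, not the authors' pipeline: the label set `LABELS`, the `decode` helper, the 0.5 threshold, and the example logits are all hypothetical.

```python
import math

# Hypothetical label set; the paper's actual drug categories are not listed here.
LABELS = ["fentanyl", "heroin", "cocaine", "methamphetamine"]

def decode(logits, threshold=0.5):
    """Return every label whose independent sigmoid probability meets the threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [name for name, p in zip(LABELS, probs) if p >= threshold]

# Made-up logits for a single report; two labels clear the threshold.
drugs = decode([2.0, -1.5, 0.3, -3.0])
```

Because each label is scored independently, the output can be empty or contain several drugs, unlike a softmax classifier that always picks exactly one class.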
