Extracting OPQRST in Electronic Health Records using Large Language Models with Reasoning

📅 2025-09-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Extracting OPQRST assessment information from electronic health records (EHRs) is challenging because the data are highly unstructured and conventional sequence-labeling methods fall short in semantic understanding and clinical alignment. To address this, we propose a generative extraction framework based on large language models (LLMs). Our method reformulates OPQRST extraction as a text generation task that explicitly incorporates clinical reasoning, thereby modeling a physician's diagnostic questioning logic. Crucially, we replace traditional named entity recognition (NER) metrics with semantic similarity measures (e.g., BERTScore) for evaluation, improving reliability and clinical interpretability, particularly in low-resource, few-shot settings. Experiments on real-world EHR data demonstrate significant improvements in both the accuracy and the readability of the extracted OPQRST elements, along with superior clinical adaptability. This work establishes a generalizable paradigm for intelligent clinical information processing in EHRs.

📝 Abstract
The extraction of critical patient information from Electronic Health Records (EHRs) poses significant challenges due to the complexity and unstructured nature of the data. Traditional machine learning approaches often fail to capture pertinent details efficiently, making it difficult for clinicians to utilize these tools effectively in patient care. This paper introduces a novel approach to extracting the OPQRST assessment from EHRs by leveraging the capabilities of Large Language Models (LLMs). We propose to reframe the task from sequence labeling to text generation, enabling the models to provide reasoning steps that mimic a physician's cognitive processes. This approach enhances interpretability and adapts to the limited availability of labeled data in healthcare settings. Furthermore, we address the challenge of evaluating the accuracy of machine-generated text in clinical contexts by proposing a modification to traditional Named Entity Recognition (NER) metrics. This includes the integration of semantic similarity measures, such as the BERT Score, to assess the alignment between generated text and the clinical intent of the original records. Our contributions demonstrate a significant advancement in the use of AI in healthcare, offering a scalable solution that improves the accuracy and usability of information extraction from EHRs, thereby aiding clinicians in making more informed decisions and enhancing patient care outcomes.
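The abstract proposes relaxing exact-match NER scoring with a semantic-similarity criterion in the style of BERTScore. The paper's implementation is not given here, so the sketch below is only illustrative: the `trigram_vector` function is a toy stand-in for a BERT token encoder (a real setup would use the `bert-score` package or a BERT model), and the 0.6 threshold is an assumed value, but the greedy token-matching structure mirrors how BERTScore combines precision and recall.

```python
from collections import Counter
from math import sqrt

def trigram_vector(token: str) -> Counter:
    """Toy embedding: character-trigram counts (stand-in for BERT token vectors)."""
    padded = f"##{token.lower()}##"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def soft_f1(candidate: str, reference: str) -> float:
    """BERTScore-style greedy matching: each token pairs with its most
    similar token on the other side; F1 combines precision and recall."""
    cand = [trigram_vector(t) for t in candidate.split()]
    ref = [trigram_vector(t) for t in reference.split()]
    if not cand or not ref:
        return 0.0
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def soft_entity_match(pred: str, gold: str, threshold: float = 0.6) -> bool:
    """Relaxed NER scoring: a predicted span counts as correct when its
    semantic similarity to the gold span clears the (assumed) threshold."""
    return soft_f1(pred, gold) >= threshold
```

Under this relaxed criterion, a prediction such as "sharp chest pain" can still count as a match for a gold span like "stabbing chest pain", which a strict string-equality NER metric would reject.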
Problem

Research questions and friction points this paper is trying to address.

Extracting OPQRST assessment from unstructured EHR data
Improving interpretability with LLM reasoning mimicking physicians
Evaluating clinical text accuracy with semantic similarity metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs reframe extraction as text generation
Integrates reasoning steps mimicking physician cognition
Uses semantic similarity metrics for clinical evaluation
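The reframing described above, from sequence labeling to reasoning-guided generation, might be set up as in the following sketch. The prompt wording and the line-per-element answer format are assumptions for illustration, not the paper's actual prompts; the parser simply separates the model's final structured answer from its free-text reasoning.

```python
# Hypothetical prompt: ask the model to reason like a physician, then
# emit one 'Element: value' line per OPQRST element (replaces BIO tagging).
OPQRST_PROMPT = """You are a clinician reading a triage note.
Reason step by step, as a physician would, about each element:
Onset, Provocation, Quality, Region, Severity, Time.
Then output one line per element as 'Element: extracted text or N/A'.

Note: {note}
"""

OPQRST_FIELDS = {"Onset", "Provocation", "Quality", "Region", "Severity", "Time"}

def build_prompt(note: str) -> str:
    """Fill the generation prompt for one EHR note."""
    return OPQRST_PROMPT.format(note=note)

def parse_opqrst(generation: str) -> dict:
    """Collect the model's 'Element: value' answer lines into a dictionary,
    ignoring the free-text reasoning that precedes them."""
    result = {}
    for line in generation.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in OPQRST_FIELDS:
            result[key.strip()] = value.strip()
    return result
```

The parsed dictionary can then be scored field by field against the gold annotations with a semantic-similarity metric such as BERTScore rather than exact span matching.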