Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications

📅 2025-07-07
🤖 AI Summary
Problem: Excessive clinical documentation burden impedes healthcare efficiency. Method: We propose a large language model (LLM)-based automation framework comprising (1) structured table generation from nurse dictations and (2) medical order extraction from doctor-patient consultation transcripts. To address data scarcity and privacy constraints, we design an agentic pipeline that synthesizes realistic, non-sensitive nurse dictations. We release SYNUR and SIMORD, the first open-source datasets for nurse observation extraction and medical order extraction, respectively. We evaluate both open-weight (e.g., Qwen, Llama) and proprietary (e.g., GPT-4o, o1) LLMs on both tasks. Contribution/Results: Experiments demonstrate LLMs' effectiveness on these high-value clinical NLP tasks, establishing a reproducible, scalable pathway toward structured electronic health record generation. The released datasets address the shortage of high-quality annotated clinical data and open evaluation benchmarks for these tasks.

📝 Abstract
Large language models (LLMs) such as GPT-4o and o1 have demonstrated strong performance on clinical natural language processing (NLP) tasks across multiple medical benchmarks. Nonetheless, two high-impact NLP tasks - structured tabular reporting from nurse dictations and medical order extraction from doctor-patient consultations - remain underexplored due to data scarcity and sensitivity, despite active industry efforts. Practical solutions to these real-world clinical tasks can significantly reduce the documentation burden on healthcare providers, allowing greater focus on patient care. In this paper, we investigate these two challenging tasks using private and open-source clinical datasets, evaluating the performance of both open- and closed-weight LLMs, and analyzing their respective strengths and limitations. Furthermore, we propose an agentic pipeline for generating realistic, non-sensitive nurse dictations, enabling structured extraction of clinical observations. To support further research in both areas, we release SYNUR and SIMORD, the first open-source datasets for nurse observation extraction and medical order extraction.
Problem

Research questions and friction points this paper is trying to address.

Structuring nurse dictations into tabular reports
Extracting medical orders from doctor-patient consultations
Generating synthetic clinical data for NLP research
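
The first two tasks above both amount to turning a free-text transcript into typed records. As a rough illustration only, the sketch below validates a model's JSON reply against a small observation schema. The field names (`name`, `value`) and the `ALLOWED_NAMES` set are hypothetical and are not taken from the SYNUR or SIMORD annotation formats; no model call is shown.

```python
import json
from dataclasses import dataclass


# Hypothetical schema: these field names are illustrative assumptions,
# not the actual SYNUR or SIMORD annotation format.
@dataclass(frozen=True)
class Observation:
    name: str   # row label, e.g. "heart_rate"
    value: str  # kept as text, since units and types vary by observation


# Assumed set of rows the target table accepts.
ALLOWED_NAMES = {"heart_rate", "temperature", "pain_score"}


def parse_observations(model_output: str) -> list[Observation]:
    """Parse a model's JSON reply and keep only well-formed, known rows."""
    try:
        payload = json.loads(model_output)
    except json.JSONDecodeError:
        return []  # unparseable reply: nothing to extract
    if not isinstance(payload, list):
        return []
    rows = []
    for item in payload:
        if not isinstance(item, dict):
            continue
        name, value = item.get("name"), item.get("value")
        if (
            isinstance(name, str)
            and name in ALLOWED_NAMES
            and isinstance(value, str)
            and value.strip()
        ):
            rows.append(Observation(name=name, value=value.strip()))
    return rows
```

Rejecting unknown row names and malformed entries, rather than trusting the reply as-is, is one simple way to keep hallucinated fields out of a structured record.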
Innovation

Methods, ideas, or system contributions that make the work stand out.

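Evaluating open- and closed-weight LLMs on two clinical NLP tasks
Agentic pipeline for generating realistic, non-sensitive nurse dictations
Open-source datasets SYNUR (nurse observation extraction) and SIMORD (medical order extraction)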
Authors

Jean-Philippe Corbeil
Microsoft
natural language processing, deep learning, machine learning
Asma Ben Abacha
Microsoft
Artificial Intelligence, Natural Language Processing, Medical Informatics
George Michalopoulos
Microsoft Healthcare & Life Sciences
Phillip Swazinna
Microsoft Healthcare & Life Sciences
Miguel Del-Agua
Microsoft Healthcare & Life Sciences
Jerome Tremblay
Microsoft Healthcare & Life Sciences
Akila Jeeson Daniel
Microsoft Healthcare & Life Sciences
Cari Bader
Microsoft Healthcare & Life Sciences
Kevin Cho
Microsoft Healthcare & Life Sciences
Pooja Krishnan
Microsoft Healthcare & Life Sciences
Nathan Bodenstab
Microsoft Healthcare & Life Sciences
Thomas Lin
Microsoft Healthcare & Life Sciences
Wenxuan Teng
Microsoft Healthcare & Life Sciences
Francois Beaulieu
Microsoft Healthcare & Life Sciences
Paul Vozila
Microsoft Healthcare & Life Sciences