Dual-stage and Lightweight Patient Chart Summarization for Emergency Physicians

📅 2025-10-05
🤖 AI Summary
Problem: Emergency physicians must rapidly extract critical clinical information from voluminous unstructured electronic health records (EHRs). Method: This paper proposes a two-stage, lightweight, fully offline clinical summarization system deployed on two NVIDIA Jetson Nano devices: one performs local EHR retrieval based on semantic chunking, while the other runs a small language model (SLM) with fewer than 7B parameters to generate the summary. The devices communicate over a lightweight socket link, and an LLM-as-Judge mechanism assesses summary quality in terms of factual accuracy, completeness, and clarity. The system produces summaries in two formats: a fixed-format list of critical findings and a query-focused narrative. Contribution/Results: In preliminary results on MIMIC-IV and de-identified real EHRs, the system produces useful summaries in under 30 seconds. Running entirely offline preserves patient privacy and indicates that edge deployment is feasible.
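The LLM-as-Judge step mentioned above could be wired up roughly as follows. This is a minimal sketch, not the authors' implementation: the three criteria come from the abstract, but the 1-5 scale, the JSON reply format, and the names `build_judge_prompt`, `parse_judge_reply`, and `JudgeScores` are assumptions made for illustration. The judge model call itself is left out; a hard-coded reply stands in for it.

```python
import json
from dataclasses import dataclass

# Criteria named in the abstract; the 1-5 scale below is an assumption.
CRITERIA = ("factual_accuracy", "completeness", "clarity")


@dataclass
class JudgeScores:
    factual_accuracy: int
    completeness: int
    clarity: int

    @property
    def mean(self) -> float:
        """Unweighted average of the three criteria."""
        return (self.factual_accuracy + self.completeness + self.clarity) / 3


def build_judge_prompt(source_text: str, summary: str) -> str:
    """Assemble a (hypothetical) prompt asking a judge model for JSON scores."""
    criteria = ", ".join(CRITERIA)
    return (
        f"Rate the summary against the source from 1 to 5 for: {criteria}.\n"
        "Reply with a JSON object using those keys and nothing else.\n\n"
        f"SOURCE:\n{source_text}\n\nSUMMARY:\n{summary}\n"
    )


def parse_judge_reply(reply: str) -> JudgeScores:
    """Parse and range-check the judge's JSON reply."""
    raw = json.loads(reply)
    scores = {}
    for name in CRITERIA:
        value = int(raw[name])
        if not 1 <= value <= 5:
            raise ValueError(f"{name} out of range: {value}")
        scores[name] = value
    return JudgeScores(**scores)


# A hard-coded reply standing in for a real judge model's output.
reply = '{"factual_accuracy": 5, "completeness": 4, "clarity": 3}'
scores = parse_judge_reply(reply)
```

Validating the reply before use matters in practice, since a judge model may return malformed JSON or out-of-range values.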

📝 Abstract
Electronic health records (EHRs) contain extensive unstructured clinical data that can overwhelm emergency physicians trying to identify critical information. We present a two-stage summarization system that runs entirely on embedded devices, enabling offline clinical summarization while preserving patient privacy. In our approach, a dual-device architecture first retrieves relevant patient record sections using the Jetson Nano-R (Retrieve), then generates a structured summary on another Jetson Nano-S (Summarize), communicating via a lightweight socket link. The summarization output is two-fold: (1) a fixed-format list of critical findings, and (2) a context-specific narrative focused on the clinician's query. The retrieval stage uses locally stored EHRs, splits long notes into semantically coherent sections, and searches for the most relevant sections per query. The generation stage uses a locally hosted small language model (SLM) to produce the summary from the retrieved text, operating within the constraints of two NVIDIA Jetson devices. We first benchmarked six open-source SLMs under 7B parameters to identify viable models. We incorporated an LLM-as-Judge evaluation mechanism to assess summary quality in terms of factual accuracy, completeness, and clarity. Preliminary results on MIMIC-IV and de-identified real EHRs demonstrate that our fully offline system can effectively produce useful summaries in under 30 seconds.
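The retrieval stage described above (split long notes into sections, then rank sections against the query) can be sketched as below. This is an illustration under stated assumptions rather than the paper's method: splitting on blank lines stands in for semantic chunking, and bag-of-words cosine similarity stands in for whatever relevance measure the authors use. The function names and the example note are hypothetical.

```python
import math
import re
from collections import Counter


def split_sections(note: str) -> list[str]:
    """Split a note on blank lines (a simple stand-in for semantic chunking)."""
    return [s.strip() for s in re.split(r"\n\s*\n", note) if s.strip()]


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a real system would likely use embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(note: str, query: str, k: int = 2) -> list[str]:
    """Return the k sections most similar to the query, best first."""
    q = bag_of_words(query)
    sections = split_sections(note)
    ranked = sorted(sections, key=lambda s: cosine(bag_of_words(s), q), reverse=True)
    return ranked[:k]


# Synthetic example note, invented for illustration only.
note = (
    "Chief complaint: chest pain radiating to left arm.\n\n"
    "Medications: metoprolol 25 mg daily, aspirin 81 mg daily.\n\n"
    "Social history: lives alone, non-smoker."
)
top = retrieve(note, "current medications and doses", k=1)
```

Here only the medications section shares a token with the query, so it is the one returned. Lexical overlap is brittle (it would miss "drugs" versus "medications"), which is one reason an embedding-based ranker would be the more likely choice on real notes.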
Problem

Research questions and friction points this paper is trying to address.

Summarizes extensive EHR data for emergency physicians
Generates structured summaries offline on embedded devices
Retrieves relevant clinical sections and creates query-focused narratives
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-device architecture for offline clinical summarization
Retrieval and generation stages connected by a lightweight socket link
Locally hosted small language model for privacy preservation
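One way the socket link between the retrieval device and the summarization device might look is sketched here. The source only states that the two devices communicate over a lightweight socket link; the framing (a 4-byte length prefix followed by UTF-8 JSON), the message fields, and the helper names `send_msg` and `recv_msg` are assumptions. To keep the sketch self-contained, `socket.socketpair()` stands in for a TCP connection between two physical devices.

```python
import json
import socket
import struct


def send_msg(sock: socket.socket, payload: dict) -> None:
    """Send one JSON message prefixed with its length as a 4-byte big-endian int."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock: socket.socket) -> dict:
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# In-process pair of connected sockets standing in for the two devices.
retriever_end, summarizer_end = socket.socketpair()
payload = {
    "query": "current medications",
    "sections": ["Medications: metoprolol 25 mg daily."],  # synthetic text
}
send_msg(retriever_end, payload)
received = recv_msg(summarizer_end)
retriever_end.close()
summarizer_end.close()
```

The length prefix lets the receiver know where each message ends, which a raw stream socket does not provide on its own.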
Authors
Jiajun Wu
Department of Electrical and Software Engineering, University of Calgary, Calgary, AB, Canada
Swaleh Zaidi
Department of Electrical and Software Engineering, University of Calgary, Calgary, AB, Canada
Braden Teitge
Rockview General Hospital, Calgary, AB, Canada
Henry Leung
Sydney University
Asset Pricing, Corporate Finance
Jiayu Zhou
University of Michigan
Machine Learning, AI + Health Informatics
Jessalyn Holodinsky
Department of Emergency Medicine, University of Calgary, Calgary, AB, Canada
Steve Drew
Assistant Professor at University of Calgary
Edge AI, IoT, Machine Learning