🤖 AI Summary
Emergency physicians face significant challenges in rapidly extracting critical clinical information from voluminous unstructured electronic health records (EHRs). Method: This paper proposes a two-stage, lightweight, fully offline clinical summarization system deployed on a dual-Jetson Nano architecture: one device performs local EHR retrieval based on semantic chunking, while the other runs a sub-7B small language model (SLM) to generate the summary. The two devices communicate over a lightweight socket link, and an LLM-as-Judge mechanism assesses summary quality in terms of factual accuracy, completeness, and clarity. The system produces dual-format summaries: a fixed-format list of critical findings and a narrative focused on the clinician's query. Contribution/Results: In preliminary evaluation on MIMIC-IV and de-identified real EHRs, the system produces useful summaries in under 30 seconds. Fully offline operation preserves patient privacy, and the results indicate that edge deployment is feasible.
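The socket link between the two devices could be sketched as follows. This is a minimal illustration, not the paper's protocol: the length-prefixed JSON framing, the function names `send_msg` and `recv_msg`, and the message fields `query` and `sections` are all assumptions made for this example, and a local `socket.socketpair()` stands in for the network connection between the retrieval and summarization devices.

```python
import json
import socket
import struct


def send_msg(sock, payload):
    """Send a dict as one length-prefixed JSON message (assumed framing)."""
    body = json.dumps(payload).encode("utf-8")
    # 4-byte big-endian length header, followed by the JSON body.
    sock.sendall(struct.pack("!I", len(body)) + body)


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed JSON message and decode it to a dict."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


# Local demo: a connected socket pair stands in for the two devices.
retrieve_end, summarize_end = socket.socketpair()

# Hypothetical message: the query plus the retrieved sections (placeholder text).
send_msg(retrieve_end, {
    "query": "medication allergies",
    "sections": ["Placeholder text of a retrieved record section."],
})
request = recv_msg(summarize_end)

retrieve_end.close()
summarize_end.close()
```

A length prefix is used here because TCP is a byte stream with no message boundaries; the receiver needs some way to know where one request ends.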
📝 Abstract
Electronic health records (EHRs) contain extensive unstructured clinical data that can overwhelm emergency physicians trying to identify critical information. We present a two-stage summarization system that runs entirely on embedded devices, enabling offline clinical summarization while preserving patient privacy. In our approach, a dual-device architecture first retrieves relevant patient record sections on one device, Jetson Nano-R (Retrieve), and then generates a structured summary on a second, Jetson Nano-S (Summarize); the two communicate via a lightweight socket link. The summarization output is two-fold: (1) a fixed-format list of critical findings, and (2) a context-specific narrative focused on the clinician's query. The retrieval stage uses locally stored EHRs, splits long notes into semantically coherent sections, and searches for the most relevant sections per query. The generation stage uses a locally hosted small language model (SLM) to produce the summary from the retrieved text, operating within the constraints of two NVIDIA Jetson devices. We first benchmarked six open-source SLMs under 7B parameters to identify viable models. We also incorporated an LLM-as-Judge evaluation mechanism to assess summary quality in terms of factual accuracy, completeness, and clarity. Preliminary results on MIMIC-IV and de-identified real EHRs demonstrate that our fully offline system can effectively produce useful summaries in under 30 seconds.