Patient-Centered Summarization Framework for AI Clinical Summarization: A Mixed-Methods Design

📅 2025-10-31
🤖 AI Summary
Current AI-based clinical summarization models tend to emphasize biomedical information while neglecting patients' values, preferences, and concerns, which undermines patient-centered care. To address this gap, we propose Patient-Centered Summaries (PCS) as a new standard for clinical dialogue summarization, developed through a mixed-methods framework that combines patient and clinician perspectives with annotation guidelines for gold-standard summaries. Five open-weight LLMs, including Llama-3.1-8B and Mistral-8B, are evaluated with zero-shot and few-shot prompting using ROUGE-L, BERTScore, and qualitative expert assessment. Results show that the models are comparable to experts on completeness and fluency, while correctness and patient-centeredness still favor the human-written summaries. This work provides a framework and an initial evaluation for patient-centered AI clinical summarization.

📝 Abstract
Large Language Models (LLMs) are increasingly demonstrating the potential to reach human-level performance in generating clinical summaries from patient-clinician conversations. However, these summaries often focus on patients' biology rather than their preferences, values, wishes, and concerns. To achieve patient-centered care, we propose a new standard for Artificial Intelligence (AI) clinical summarization tasks: Patient-Centered Summaries (PCS). Our objective was to develop a framework to generate PCS that capture patient values and ensure clinical utility and to assess whether current open-source LLMs can achieve human-level performance in this task. We used a mixed-methods process. Two Patient and Public Involvement groups (10 patients and 8 clinicians) in the United Kingdom participated in semi-structured interviews exploring what personal and contextual information should be included in clinical summaries and how it should be structured for clinical use. Findings informed annotation guidelines used by eight clinicians to create gold-standard PCS from 88 atrial fibrillation consultations. Sixteen consultations were used to refine a prompt aligned with the guidelines. Five open-source LLMs (Llama-3.2-3B, Llama-3.1-8B, Mistral-8B, Gemma-3-4B, and Qwen3-8B) generated summaries for 72 consultations using zero-shot and few-shot prompting, evaluated with ROUGE-L, BERTScore, and qualitative metrics. Patients emphasized lifestyle routines, social support, recent stressors, and care values. Clinicians sought concise functional, psychosocial, and emotional context. The best zero-shot performance was achieved by Mistral-8B (ROUGE-L 0.189) and Llama-3.1-8B (BERTScore 0.673); the best few-shot by Llama-3.1-8B (ROUGE-L 0.206, BERTScore 0.683). Completeness and fluency were similar between experts and models, while correctness and patient-centeredness favored human PCS.
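The abstract reports ROUGE-L scores without saying which implementation or tokenization was used. As a rough illustration only, the sketch below computes a sentence-level ROUGE-L F1 from the longest common subsequence (LCS) of tokens. The function names (`lcs_length`, `rouge_l_f1`), the lowercased whitespace tokenization, and the equal weighting of precision and recall are assumptions made for this example and are not taken from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 (assumed: lowercase, whitespace tokens, beta = 1)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up sentences for illustration; not drawn from the study data.
    gold = "patient prefers morning walks and lives alone"
    generated = "patient lives alone and prefers walks"
    print(round(rouge_l_f1(gold, generated), 3))
```

Because ROUGE-L rewards only in-order token overlap, a summary that expresses the same patient preferences in different words can score low, which is one reason the study pairs it with BERTScore and qualitative expert ratings.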
Problem

Research questions and friction points this paper is trying to address.

Developing patient-centered AI summaries that capture patient values
Assessing whether open-source LLMs can reach human-level performance in clinical summarization
Addressing the limitations of biology-focused summaries by adding psychosocial context
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed a patient-centered AI summarization framework
Used a mixed-methods design with patient and clinician interviews
Evaluated open-source LLMs with zero-shot and few-shot prompting
Maria Lizarazo Jimenez
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Ana Gabriela Claros
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Kieran Green
University Hospitals Plymouth NHS Trust, Derriford Hospital, Plymouth, UK
David Toro-Tobon
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Felipe Larios
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Sheena Asthana
Centre for Health Technology, University of Plymouth, UK
Camila Wenczenovicz
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Kerly Guevara Maldonado
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Luis Vilatuna-Andrango
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Cristina Proano-Velez
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Satya Sai Sri Bandi
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Shubhangi Bagewadi
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Megan E. Branda
Mayo Clinic
Biostatistics, Clinical Trials, Patient Centered Care, Implementation Science, Shared Decision Making
Misk Al Zahidy
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Saturnino Luz
The University of Edinburgh
Digital Biomarkers, Precision Medicine, Deep Phenotyping, Speech and Signal Processing, Machine Learning
Mirella Lapata
School of Informatics, Edinburgh University
natural language processing
Juan P. Brito
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
Oscar J. Ponce-Ponte
Care and AI Laboratory, Knowledge and Evaluation Research Unit, Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA