Checkup2Action: A Multimodal Clinical Check-up Report Dataset for Patient-Oriented Action Card Generation

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the difficulty patients face in understanding and acting on heterogeneous, multimodal clinical check-up reports. The authors introduce a patient-oriented “action card” generation task and an accompanying multimodal dataset of 2,000 de-identified real-world reports. Each action card covers one clinically relevant issue and gives its priority, recommended department, follow-up time window, a layperson-friendly explanation, and suggested questions for clinicians, while explicitly avoiding diagnostic or treatment-prescriptive advice. The task is formulated as constrained structured generation, and the evaluation protocol covers issue coverage and precision, priority consistency, department and time recommendation accuracy, action complexity, usefulness, readability, and safety compliance. Experiments with general-purpose and medical large language models reveal trade-offs among issue coverage, action correctness, conciseness, and safety alignment, which the authors present as a new benchmark for patient-oriented reasoning over check-up reports.

📝 Abstract

Clinical check-up reports are multimodal documents that combine page layouts, tables, numerical biomarkers, abnormality flags, imaging findings, and domain-specific terminology. Such heterogeneous evidence is difficult for laypersons to interpret and translate into concrete follow-up actions. Although large language models show promise in medical summarisation and triage support, their ability to generate safe, prioritised, and patient-oriented actions from multimodal check-up reports remains under-benchmarked. We present Checkup2Action, a multimodal clinical check-up report dataset and benchmark for structured Action Card generation. Each card describes one clinically relevant issue and specifies its priority, recommended department, follow-up time window, patient-facing explanation, and questions for clinicians, while avoiding diagnostic or treatment-prescriptive claims. The dataset contains 2,000 de-identified real-world check-up reports covering demographic information, physical examinations, laboratory tests, cardiovascular assessments, imaging-related evidence, and physician summaries. We formulate checkup-to-action generation as a constrained structured generation task and introduce an evaluation protocol covering issue coverage and precision, priority consistency, department and time recommendation accuracy, action complexity, usefulness, readability, and safety compliance. Experiments with general-purpose and medical large language models reveal clear trade-offs between issue coverage, action correctness, conciseness, and safety alignment. Checkup2Action provides a new multimodal benchmark for evaluating patient-oriented reasoning over clinical check-up reports.
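
To make the Action Card structure and the issue-level metrics concrete, the following is a minimal Python sketch, not the authors' schema or scoring code. The field names, the example priority labels, and the exact-match rule on normalised issue labels are all assumptions introduced for illustration; the abstract names the card components and the "issue coverage and precision" metrics but does not specify how they are represented or computed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ActionCard:
    """One card per clinically relevant issue (field names are hypothetical)."""
    issue: str                 # short label for the finding
    priority: str              # e.g. "high" / "medium" / "low" (assumed labels)
    department: str            # recommended department to visit
    follow_up_window: str      # e.g. "within 2 weeks" (free text, assumed)
    explanation: str           # patient-facing, non-diagnostic explanation
    questions: List[str] = field(default_factory=list)  # questions for clinicians


def issue_coverage_and_precision(
    predicted: List[ActionCard], reference: List[ActionCard]
) -> Tuple[float, float]:
    """Compare predicted and reference cards by normalised issue label.

    Assumption: two cards refer to the same issue when their labels match
    exactly after lowercasing and stripping whitespace. A real protocol
    would likely need a softer matching rule.
    """
    pred = {c.issue.strip().lower() for c in predicted}
    ref = {c.issue.strip().lower() for c in reference}
    matched = pred & ref
    coverage = len(matched) / len(ref) if ref else 0.0     # share of reference issues found
    precision = len(matched) / len(pred) if pred else 0.0  # share of predicted issues that are valid
    return coverage, precision


def make_card(issue: str) -> ActionCard:
    """Build a placeholder card; every value other than the issue is illustrative."""
    return ActionCard(issue, "medium", "General Medicine", "within 1 month", "placeholder")


reference_cards = [make_card("Elevated LDL cholesterol"), make_card("Thyroid nodule")]
predicted_cards = [
    make_card("elevated ldl cholesterol"),
    make_card("Thyroid nodule"),
    make_card("Mild anaemia"),
]
coverage, precision = issue_coverage_and_precision(predicted_cards, reference_cards)
```

In this toy case the prediction recovers both reference issues but adds one extra, so coverage is high while precision drops. That is the kind of coverage versus conciseness trade-off the abstract reports.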

Problem

Research questions and friction points this paper is trying to address.

multimodal clinical reports

patient-oriented action generation

action card

health literacy

clinical decision support

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal clinical report

action card generation

patient-oriented reasoning