Incentivizing Cardiologist-Like Reasoning in MLLMs for Interpretable Echocardiographic Diagnosis

📅 2026-01-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multimodal large models for medical imaging struggle to integrate quantitative echocardiographic measurements with clinical manifestations and lack interpretable diagnostic reasoning pathways. This work proposes a cardiologist-inspired reasoning framework built on a Cardiac Reasoning Template (CRT), which standardizes the diagnostic procedure without costly case-by-case verification of reasoning paths. On top of CRT, a reinforcement learning scheme called CardiacMind uses three rewards (PQtR, PQlR, and ESR) to steer a multimodal large language model toward detailed reasoning that integrates evidence across views and modalities and is grounded in visual content. The approach reports a 48% improvement in multi-view echocardiographic diagnosis across 15 complex cardiac diseases and a 5% improvement on CardiacNet-PAH over prior methods, and a user study shows 93.33% clinician agreement with its reasoning logic.

📝 Abstract
Echocardiographic diagnosis is vital for cardiac screening yet remains challenging. Existing echocardiography foundation models do not effectively capture the relationships between quantitative measurements and clinical manifestations, whereas medical reasoning multimodal large language models (MLLMs) require costly construction of detailed reasoning paths and remain ineffective at directly incorporating such echocardiographic priors into their reasoning. To address these limitations, we propose a novel approach comprising a Cardiac Reasoning Template (CRT) and CardiacMind, which enhances an MLLM's echocardiographic reasoning by introducing a cardiologist-like mindset. Specifically, CRT provides stepwise canonical diagnostic procedures for complex cardiac diseases to streamline reasoning path construction without the need for costly case-by-case verification. To incentivize a reasoning MLLM under CRT, we develop CardiacMind, a new reinforcement learning scheme with three novel rewards: Procedural Quantity Reward (PQtR), Procedural Quality Reward (PQlR), and Echocardiographic Semantic Reward (ESR). PQtR promotes detailed reasoning, PQlR promotes integration of evidence across views and modalities, and ESR grounds stepwise descriptions in visual content. Our method shows a 48% improvement in multiview echocardiographic diagnosis for 15 complex cardiac diseases and a 5% improvement on CardiacNet-PAH over prior methods. A user study on our method's reasoning outputs shows 93.33% clinician agreement with its cardiologist-like reasoning logic. Our code will be available.
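
The abstract names three rewards but does not give their formulas, so the following is only an illustrative sketch of how three such signals could be combined into one scalar for reinforcement learning. Everything in it is an assumption made for illustration: the toy template steps, the keyword cues, the view names, the cosine-similarity stand-in for visual grounding, and the equal weights. None of it is the authors' implementation of PQtR, PQlR, or ESR.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TemplateStep:
    """One step of a hypothetical diagnostic template (not the paper's CRT)."""
    name: str
    cues: tuple  # lowercase substrings that count as "this step was addressed"


# Toy template, invented for this example only.
TOY_TEMPLATE = (
    TemplateStep("chamber size", ("ventricle", "atrium", "dilat")),
    TemplateStep("wall motion", ("wall motion", "hypokinesis")),
    TemplateStep("valve function", ("valve", "regurgitation")),
)


def step_coverage(reasoning: str, template: Sequence[TemplateStep]) -> float:
    """Quantity-style proxy: fraction of template steps the reasoning mentions."""
    text = reasoning.lower()
    covered = sum(any(cue in text for cue in step.cues) for step in template)
    return covered / len(template)


def cross_view_score(reasoning: str, views: Sequence[str]) -> float:
    """Quality-style proxy: fraction of available views cited as evidence."""
    text = reasoning.lower()
    return sum(view.lower() in text for view in views) / len(views)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Semantic-style proxy: cosine similarity of caller-supplied embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_reward(reasoning, template, views, text_emb, image_emb,
                    weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted sum of the three proxies; the weights are an assumption."""
    w_quantity, w_quality, w_semantic = weights
    return (w_quantity * step_coverage(reasoning, template)
            + w_quality * cross_view_score(reasoning, views)
            + w_semantic * cosine(text_emb, image_emb))


if __name__ == "__main__":
    sample = ("The left ventricle is dilated in the A4C view. "
              "Mitral valve regurgitation is seen in PLAX.")
    print(combined_reward(sample, TOY_TEMPLATE, ("A4C", "PLAX", "PSAX"),
                          [1.0, 0.0], [1.0, 0.0]))
```

A weighted sum is only one possible design. Keyword matching and cosine similarity are deliberately crude placeholders for whatever scoring the paper actually uses.
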
Problem

Research questions and friction points this paper is trying to address.

echocardiographic diagnosis
medical reasoning
multimodal large language models
interpretable AI
cardiologist-like reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cardiac Reasoning Template
CardiacMind
Reinforcement Learning
Multimodal LLM
Interpretable Diagnosis