🤖 AI Summary
In radiology report generation (RRG), poor alignment between visual features and medical semantics, together with interference from irrelevant features, compromises diagnostic accuracy. To address this, we propose Medical Concept Aligned Radiology Report Generation (MCA-RG), which combines anatomy-based contrastive learning, a pathology matching loss, and a feature gating mechanism to explicitly model fine-grained mappings between image regions and curated anatomy and pathology concept banks, enabling knowledge-driven semantic enhancement. Evaluated on MIMIC-CXR and CheXpert Plus, MCA-RG improves the clinical relevance and diagnostic consistency of generated reports, boosting accuracy for key anatomy–pathology descriptions by 12.7% and 9.3%, respectively, and outperforming current state-of-the-art methods. Our core contribution is a principled, interpretable, and generalizable vision–semantics alignment paradigm that advances large language model (LLM)-based RRG toward clinically reliable reporting.
📝 Abstract
Despite significant advances in adapting Large Language Models (LLMs) for radiology report generation (RRG), clinical adoption remains challenging because pathological and anatomical features are difficult to map accurately to their corresponding text descriptions. Semantic-agnostic feature extraction further hampers the generation of accurate diagnostic reports. To address these challenges, we introduce Medical Concept Aligned Radiology Report Generation (MCA-RG), a knowledge-driven framework that explicitly aligns visual features with distinct medical concepts to enhance the report generation process. MCA-RG uses two curated concept banks: a pathology bank containing lesion-related knowledge, and an anatomy bank containing anatomical descriptions. The visual features are aligned with these medical concepts and undergo tailored enhancement. We further propose an anatomy-based contrastive learning procedure to improve the generalization of anatomical features, coupled with a matching loss for pathological features that prioritizes clinically relevant regions. A feature gating mechanism then filters out low-quality concept features. Finally, the resulting visual features, each corresponding to an individual medical concept, are used to guide report generation. Experiments on two public benchmarks (MIMIC-CXR and CheXpert Plus) demonstrate that MCA-RG achieves superior performance, highlighting its effectiveness in radiology report generation.
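The abstract does not specify how alignment or gating is implemented, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes that alignment is a single attention step from concept embeddings to image patch features, and that the gate is a per-concept sigmoid score with a hard threshold. The function names, the gate parameters `w` and `b`, the threshold, and all array sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def align_to_concepts(patches, concepts):
    """Pool patch features into one visual feature per concept.

    patches:  (P, D) image patch features (assumed input)
    concepts: (C, D) concept embeddings from a concept bank (assumed input)
    returns:  (C, D) concept-aligned visual features
    """
    d = patches.shape[-1]
    scores = concepts @ patches.T / np.sqrt(d)  # (C, P) concept-patch similarity
    attn = softmax(scores, axis=-1)             # each concept attends over patches
    return attn @ patches                       # weighted average of patches

def gate_features(aligned, w, b=0.0, threshold=0.5):
    """Hypothetical gate: score each concept feature and zero out weak ones.

    aligned: (C, D) concept-aligned features
    w, b:    parameters of a linear scoring layer (learned in practice)
    returns: gated features (C, D) and the gate values (C,)
    """
    gate = 1.0 / (1.0 + np.exp(-(aligned @ w + b)))  # sigmoid score per concept
    keep = gate >= threshold                          # hard filter for low scores
    return aligned * (gate * keep)[:, None], gate

# Toy data: 49 patches, 14 concepts, 32-dimensional features (all arbitrary).
rng = np.random.default_rng(0)
patches = rng.normal(size=(49, 32))
concepts = rng.normal(size=(14, 32))
w = rng.normal(size=32)

aligned = align_to_concepts(patches, concepts)
gated, gate = gate_features(aligned, w)
```

In this sketch, each row of `gated` is either a down-weighted concept feature or all zeros, so a downstream report decoder would only receive the concept features that pass the gate. The contrastive and matching losses mentioned in the abstract are training objectives and are not covered here.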