When are radiology reports useful for training medical image classifiers?

📅 2025-10-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates when and how radiology reports can be leveraged to train medical image classifiers. To overcome performance bottlenecks of image-only models, the authors study a systematic multimodal framework with two integration points: (1) image–text alignment during pretraining, and (2) report-guided label extraction or joint modeling during fine-tuning. Experiments reveal that textual information significantly improves classification accuracy only when report content is strongly correlated with the diagnostic labels; when the correlation is weak, alignment-based pretraining can degrade performance. In contrast, integrating reports during fine-tuning can yield large gains, in some settings exceeding the impact of the pretraining method itself. Crucially, the study identifies label–text correlation strength as the key boundary condition governing text-augmented performance, a finding not previously established. Moreover, fine-tuning–based report incorporation proves more robust and clinically practical than pretraining alternatives. The results provide both theoretical insight and actionable guidelines for efficient multimodal learning in medical AI.
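The "explicit image–text alignment" pretraining discussed above is commonly instantiated as a CLIP-style symmetric contrastive (InfoNCE) objective over paired image and report embeddings. The following is a minimal numpy sketch of that objective, not the authors' actual implementation; the embedding shapes and temperature value are illustrative assumptions.

```python
import numpy as np

def clip_alignment_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/report embeddings.

    Row i of img_emb is treated as a positive pair with row i of txt_emb;
    every other row in the batch serves as an in-batch negative.
    """
    # L2-normalize so dot products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    logits = img @ txt.T / temperature   # (batch, batch) similarity matrix
    labels = np.arange(len(logits))      # positives sit on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the image->text and text->image retrieval directions.
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))

rng = np.random.default_rng(0)
aligned = rng.normal(size=(8, 32))
loss_matched = clip_alignment_loss(aligned, aligned)         # correctly paired
loss_shuffled = clip_alignment_loss(aligned, aligned[::-1])  # mismatched pairs
```

The paper's central caveat maps directly onto this objective: it pulls each image toward its *report*, so it only transfers to classification when the downstream label is actually expressed in the report text.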

📝 Abstract
Medical images used to train machine learning models are often accompanied by radiology reports containing rich expert annotations. However, relying on these reports as inputs for clinical prediction requires the timely manual work of a trained radiologist. This raises a natural question: when can radiology reports be leveraged during training to improve image-only classification? Prior works are limited to evaluating pre-trained image representations by fine-tuning them to predict diagnostic labels, often extracted from reports, ignoring tasks with labels that are weakly associated with the text. To address this gap, we conduct a systematic study of how radiology reports can be used during both pre-training and fine-tuning, across diagnostic and prognostic tasks (e.g., 12-month readmission), and under varying training set sizes. Our findings reveal that: (1) Leveraging reports during pre-training is beneficial for downstream classification tasks where the label is well-represented in the text; however, pre-training through explicit image-text alignment can be detrimental in settings where it's not; (2) Fine-tuning with reports can lead to significant improvements and even have a larger impact than the pre-training method in certain settings. These results provide actionable insights into when and how to leverage privileged text data to train medical image classifiers while highlighting gaps in current research.
Problem

Research questions and friction points this paper is trying to address.

Determining when radiology reports improve medical image classification during training
Addressing limitations of prior work on pre-training and fine-tuning with reports
Systematically evaluating report usage across diagnostic and prognostic medical tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematically studies radiology reports in pre-training and fine-tuning
Evaluates diagnostic and prognostic tasks with varying training sizes
Identifies label–text correlation strength as the condition under which explicit image-text alignment helps or hurts