Improving Medical Visual Representations via Radiology Report Generation

📅 2023-10-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To learn stronger visual representations for medical image analysis while also generating radiology reports, this paper proposes RadTex, a vision-language pretraining model that pairs a CNN-based image encoder with a Transformer-based text decoder and is optimized for radiology. RadTex explores bidirectional captioning, a generative objective, as an alternative to the contrastive learning approaches commonly used in medical vision-language pretraining (MVLP). Captioning pretraining proves competitive with established contrastive methods, reaching a macro-AUC of 89.4% on CheXpert. The lightweight text decoder also generates clinically relevant radiology reports (macro-F1 of 0.349) and can provide targeted, interactive responses.
📝 Abstract
Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. Contrastive learning approaches have increasingly been adopted for medical vision language pretraining (MVLP), yet recent developments in generative AI offer new modeling alternatives. This paper introduces RadTex, a CNN-encoder transformer-decoder architecture optimized for radiology. We explore bidirectional captioning as an alternative MVLP strategy and demonstrate that RadTex's captioning pretraining is competitive with established contrastive methods, achieving a CheXpert macro-AUC of 89.4%. Additionally, RadTex's lightweight text decoder not only generates clinically relevant radiology reports (macro-F1 score of 0.349), but also provides targeted, interactive responses, highlighting the utility of bidirectional captioning in advancing medical image analysis.
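The two headline numbers above are macro-averaged metrics: a score is computed separately for each label and the per-label scores are then averaged with equal weight. The sketch below illustrates that averaging in plain Python. It is only an illustration under stated assumptions: the label names, toy scores, and 0.5 decision threshold are invented here, and the paper's actual label set, report labeling procedure, and thresholds are not described on this page.

```python
from typing import Dict, List, Sequence


def binary_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def macro_average(per_label: List[float]) -> float:
    """Unweighted mean over labels, so rare labels count as much as common ones."""
    return sum(per_label) / len(per_label)


# Hypothetical toy data: two labels, four studies each (not from the paper).
y_true: Dict[str, List[int]] = {
    "label_a": [1, 0, 1, 0],
    "label_b": [0, 0, 1, 1],
}
y_score: Dict[str, List[float]] = {
    "label_a": [0.9, 0.2, 0.6, 0.7],
    "label_b": [0.1, 0.4, 0.8, 0.3],
}
THRESHOLD = 0.5  # assumed decision threshold for turning scores into predictions

per_label_auc = [binary_auc(y_true[k], y_score[k]) for k in y_true]
per_label_f1 = [
    binary_f1(y_true[k], [int(s >= THRESHOLD) for s in y_score[k]])
    for k in y_true
]

macro_auc = macro_average(per_label_auc)
macro_f1 = macro_average(per_label_f1)
print(f"macro-AUC = {macro_auc:.3f}, macro-F1 = {macro_f1:.3f}")
```

Because macro averaging weights every label equally, a poor score on a single rare finding lowers the aggregate as much as a poor score on a common one, which is worth keeping in mind when reading aggregate figures like those quoted in the abstract.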
Problem

Research questions and friction points this paper is trying to address.

Medical Visual Representation Learning
Radiology Report Generation
Medical Image Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

RadTex
Bi-directional Captioning
Medical Image Analysis
Keegan Quigley
MIT Lincoln Laboratory, Lexington, MA
Miriam Cha
MIT Lincoln Laboratory, Harvard University
Josh Barua
MIT Lincoln Laboratory, Lexington, MA
Geeticka Chauhan
MIT CSAIL, Cambridge, MA
Seth Berkowitz
BIDMC, Boston, MA
Steven Horng
BIDMC, Boston, MA
Polina Golland
Massachusetts Institute of Technology