MR-CLIP: Efficient Metadata-Guided Learning of MRI Contrast Representations

📅 2025-06-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
MRI contrast interpretation relies heavily on DICOM metadata (e.g., TR/TE), yet such parameters are frequently missing, noisy, or only coarsely labeled (e.g., "T1w") in clinical datasets, hindering image parsing, retrieval, and cross-center integration. To address this, we propose the first unsupervised contrastive learning framework grounded exclusively in raw DICOM acquisition parameters. The method jointly models image content and scanner-specific acquisition settings, learning fine-grained, anatomy-invariant, contrast-aware representations without human annotations. The resulting protocol- and scanner-agnostic representations support modality-agnostic representation learning and data harmonization. Evaluated on contrast classification and cross-protocol retrieval, the approach significantly outperforms existing baselines and scales well. The code and pretrained models are publicly available.

📝 Abstract
Accurate interpretation of Magnetic Resonance Imaging scans in clinical systems is based on a precise understanding of image contrast. This contrast is primarily governed by acquisition parameters, such as echo time and repetition time, which are stored in the DICOM metadata. To simplify contrast identification, broad labels such as T1-weighted or T2-weighted are commonly used, but these offer only a coarse approximation of the underlying acquisition settings. In many real-world datasets, such labels are entirely missing, leaving raw acquisition parameters as the only indicators of contrast. Adding to this challenge, the available metadata is often incomplete, noisy, or inconsistent. The lack of reliable and standardized metadata complicates tasks such as image interpretation, retrieval, and integration into clinical workflows. Furthermore, robust contrast-aware representations are essential to enable more advanced clinical applications, such as achieving modality-invariant representations and data harmonization. To address these challenges, we propose MR-CLIP, a multimodal contrastive learning framework that aligns MR images with their DICOM metadata to learn contrast-aware representations, without relying on manual labels. Trained on a diverse clinical dataset that spans various scanners and protocols, MR-CLIP captures contrast variations across acquisitions and within scans, enabling anatomy-invariant representations. We demonstrate its effectiveness in cross-modal retrieval and contrast classification, highlighting its scalability and potential for further clinical applications. The code and weights are publicly available at https://github.com/myigitavci/MR-CLIP.
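The abstract describes aligning MR images with their DICOM metadata via multimodal contrastive learning. A minimal sketch of the standard CLIP-style symmetric InfoNCE objective such a framework would optimize is below; the function name, temperature value, and use of NumPy are illustrative assumptions, not taken from the MR-CLIP codebase.

```python
import numpy as np

def clip_style_loss(img_emb, meta_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img_emb, meta_emb: (batch, dim) arrays; row i of each is a matched
    image / DICOM-metadata pair. Illustrative sketch only.
    """
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    meta = meta_emb / np.linalg.norm(meta_emb, axis=1, keepdims=True)

    logits = img @ meta.T / temperature      # (batch, batch) similarity matrix
    labels = np.arange(len(logits))          # matched pairs lie on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the image->metadata and metadata->image directions
    return 0.5 * (cross_entropy(logits, labels) +
                  cross_entropy(logits.T, labels))
```

Training with this objective pulls each image embedding toward the embedding of its own acquisition metadata and pushes it away from the metadata of other scans in the batch, which is what yields contrast-aware, annotation-free representations.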
Problem

Research questions and friction points this paper is trying to address.

Lack of reliable MRI contrast metadata labels
Incomplete and noisy DICOM acquisition parameters
Need for robust contrast-aware image representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal contrastive learning for MRI metadata alignment
DICOM metadata guides contrast-aware representation learning
Anatomy-invariant representations from diverse clinical datasets
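The bullets above hinge on feeding raw DICOM acquisition parameters to the metadata branch of the model. One simple way to do this, sketched below, is to serialize the parameters into a text "caption" for a CLIP-style text encoder; the field names mirror real DICOM keywords (RepetitionTime, EchoTime, etc.), but the caption template and function name are illustrative assumptions, not the exact format used by MR-CLIP.

```python
def metadata_to_caption(params):
    """Serialize raw acquisition parameters into a text caption.

    `params` maps DICOM keyword names to values (TR/TE in ms).
    Missing fields are simply omitted, mirroring the incomplete
    metadata common in clinical datasets.
    """
    parts = []
    if "RepetitionTime" in params:
        parts.append(f"TR {params['RepetitionTime']:.0f} ms")
    if "EchoTime" in params:
        parts.append(f"TE {params['EchoTime']:.0f} ms")
    if "FlipAngle" in params:
        parts.append(f"flip angle {params['FlipAngle']:.0f} deg")
    if "MagneticFieldStrength" in params:
        parts.append(f"{params['MagneticFieldStrength']:.1f} T scanner")
    return ", ".join(parts) if parts else "unknown acquisition"
```

For example, `metadata_to_caption({"RepetitionTime": 500, "EchoTime": 10})` returns `"TR 500 ms, TE 10 ms"`, a caption that encodes the fine-grained contrast information that coarse labels like "T1w" discard.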