🤖 AI Summary
This work addresses the limitations of conventional deep learning approaches for MRI, which rely on reconstructed magnitude images, discard phase information, incur high computational costs, and struggle to capture the global non-local characteristics of k-space data. To overcome these challenges, the authors propose a complex-domain Vision Transformer (kViT) that enables end-to-end classification directly in k-space—a first in the field—and introduce a radial patch partitioning strategy aligned with the energy distribution of MRI frequency data. Evaluated on both the fastMRI benchmark and an internal dataset, kViT achieves classification performance on par with state-of-the-art image-domain models, demonstrates superior robustness under high acceleration factors, and reduces training memory consumption by up to 68×.
📝 Abstract
Deep learning applications in Magnetic Resonance Imaging (MRI) predominantly operate on reconstructed magnitude images, a process that discards phase information and requires computationally expensive transforms. Standard neural network architectures rely on local operations (convolutions or grid patches) that are ill-suited to the global, non-local nature of raw frequency-domain (k-space) data. In this work, we propose a novel complex-valued Vision Transformer (kViT) designed to perform classification directly on k-space data. To bridge the geometric disconnect between current architectures and MRI physics, we introduce a radial k-space patching strategy that respects the spectral energy distribution of the frequency domain. Extensive experiments on the fastMRI dataset and an in-house dataset demonstrate that our approach achieves classification performance competitive with state-of-the-art image-domain baselines (ResNet, EfficientNet, ViT). Crucially, kViT exhibits superior robustness to high acceleration factors and is markedly more memory-efficient, reducing VRAM consumption during training by up to 68× compared to standard methods. This establishes a pathway for resource-efficient, direct-from-scanner AI analysis.
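To make the idea of radial patching concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-coil, centered 2D k-space array and equal-width concentric rings; the abstract does not specify how kViT defines its radial patches, so the function name `radial_ring_patches`, the ring count, and the equal-width spacing are all illustrative assumptions.

```python
import numpy as np


def radial_ring_patches(kspace, num_rings):
    """Group centered 2D k-space samples into concentric rings (hypothetical helper).

    Assumes the zero-frequency (DC) sample sits at index (h // 2, w // 2),
    which is where np.fft.fftshift places it.
    """
    h, w = kspace.shape
    ky, kx = np.indices((h, w))
    # Euclidean distance of every sample from the DC component.
    radius = np.hypot(ky - h // 2, kx - w // 2)
    # Equal-width ring boundaries from 0 to the largest radius (an assumption).
    edges = np.linspace(0.0, radius.max(), num_rings + 1)
    # Using only the interior edges yields ring indices in [0, num_rings - 1].
    ring_index = np.digitize(radius, edges[1:-1])
    # Each patch is a 1D complex array; rings hold different numbers of samples.
    patches = [kspace[ring_index == r] for r in range(num_rings)]
    return patches, ring_index


# Synthetic stand-in for an MR image; real data would come from the scanner.
rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))

# Centered k-space keeps both magnitude and phase as complex values.
kspace = np.fft.fftshift(np.fft.fft2(image))

patches, ring_index = radial_ring_patches(kspace, num_rings=8)
samples_per_ring = [p.size for p in patches]
```

Because each ring contains a different number of samples, a transformer built on such patches would still need a step that maps every ring to a fixed-size complex token (for example padding or resampling); that step is left out here because the abstract does not describe it.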