🤖 AI Summary
High-resolution deformable medical image registration must align fine-grained anatomical structures accurately, yet doing so incurs prohibitive computational and memory overhead. To address this, we propose the Multi-Axis Cross-Covariance Attention (MAXCA) module, a novel lightweight, plug-and-play component that for the first time integrates regional attention with dilated Cross-Covariance Attention (XCA) within a multi-axis parallel architecture. MAXCA captures both global and local long-range dependencies with computational complexity that is linear in spatial resolution. It can be seamlessly embedded into mainstream registration networks without architectural modification. Extensive evaluation across seven public medical imaging datasets and two benchmark tasks (cross-subject and intra-subject registration) demonstrates consistent state-of-the-art performance. Notably, MAXCA achieves substantial improvements in deformation modeling and alignment accuracy for subtle anatomical structures, including white matter fibers and hepatic portal vein branches in brain and abdominal scans.
📝 Abstract
Deformable image registration is a fundamental requirement for medical image analysis. Recently, transformers have been widely used in deep learning-based registration methods for their ability to capture long-range dependencies via self-attention (SA). However, the computation and memory costs of SA grow quadratically with spatial resolution, which hinders transformers from processing subtle textural information in high-resolution image features, e.g., at full and half image resolution. This limits deformable registration, as high-resolution textural information is crucial for finding precise pixel-wise correspondence between subtle anatomical structures. Cross-Covariance Attention (XCA), a "transposed" version of SA that operates across feature channels, has complexity that grows linearly with spatial resolution, making it feasible to capture long-range dependencies among high-resolution image features. However, existing XCA-based transformers capture only coarse global long-range dependencies, making them ill-suited to deformable image registration, which relies primarily on fine-grained local correspondence. In this study, we propose to improve existing deep learning-based registration methods by embedding a new XCA mechanism. To this end, we design an XCA-based transformer block optimized for deformable medical image registration, named Multi-Axis XCA (MAXCA). MAXCA serves as a general network block that can be embedded into various registration network architectures. It captures both global and local long-range dependencies among high-resolution image features by applying regional and dilated XCA in parallel via a multi-axis design. Extensive experiments on two well-benchmarked inter-/intra-patient registration tasks with seven public medical datasets demonstrate that our MAXCA block enables state-of-the-art registration performance.
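
To make the "transposed" attention idea concrete, the following is a minimal single-head sketch of plain XCA in NumPy. It is not the MAXCA block, and it omits the regional and dilated partitioning described above. The function name, the random projection weights `w_q`, `w_k`, `w_v`, and the fixed scalar `temperature` are illustrative assumptions. The point it shows is that the attention map has shape (d, d) over channels, so its size does not depend on the number of tokens N, and the two matrix products cost on the order of N·d² operations.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_covariance_attention(x, w_q, w_k, w_v, temperature=1.0):
    """Single-head XCA sketch for tokens x of shape (N, d).

    Hypothetical helper for illustration only; the weights and the fixed
    temperature are assumptions, not values from the paper.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # each (N, d)
    # L2-normalise every channel along the token axis.
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-6)
    k = k / (np.linalg.norm(k, axis=0, keepdims=True) + 1e-6)
    # Channel-by-channel attention map: (d, d), independent of N.
    attn = softmax((q.T @ k) / temperature, axis=-1)
    # Mix the channels of v with the attention map: (N, d).
    return v @ attn.T, attn

rng = np.random.default_rng(0)
d = 8
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

for n in (64, 512):  # number of tokens (flattened voxels or patches)
    out, attn = cross_covariance_attention(
        rng.standard_normal((n, d)), w_q, w_k, w_v
    )
    print(n, out.shape, attn.shape)
```

Standard SA would instead build an (N, N) map over tokens, which is where the quadratic growth with spatial resolution comes from.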