🤖 AI Summary
Existing self-supervised medical imaging methods overlook the composability and decomposability of anatomical structures, leading to inadequate anatomical modeling and limiting few-shot learning, fine-tuning efficacy, and clinical interpretability. To address this, we propose an anatomy-consistent self-supervised learning framework: (1) a grid-based image cropping strategy explicitly models multi-scale anatomical composition relationships; (2) a global–local dual-branch architecture jointly enforces macroscopic structural consistency and patch-level feature alignment through matrix matching; and (3) hierarchical semantic bridging narrows the gap between high-level pathological semantics and low-level tissue abnormalities. Evaluated across six diverse medical imaging datasets and two backbone architectures, our method significantly improves few-shot classification accuracy, transfer learning performance, and cross-domain robustness. It also enhances model interpretability by yielding clinically meaningful attention patterns, and it demonstrates strong practical potential for real-world deployment.
📝 Abstract
Medical images acquired under standardized protocols show consistent macroscopic or microscopic anatomical structures, and these structures consist of composable/decomposable organs and tissues. Existing self-supervised learning (SSL) methods, however, do not exploit these composable/decomposable structural attributes inherent to medical images. To overcome this limitation, this paper introduces a novel SSL approach called ACE, which learns anatomically consistent embeddings via composition and decomposition with two key branches: (1) global consistency, which captures discriminative macro-structures by extracting global features; and (2) local consistency, which learns fine-grained anatomical details from composable/decomposable patch features via corresponding matrix matching. Experimental results across 6 datasets and 2 backbones, evaluated in few-shot learning, fine-tuning, and property analysis, show ACE's superior robustness, transferability, and clinical potential. The innovations of ACE lie in grid-wise image cropping, leveraging the intrinsic compositionality and decompositionality of medical images, bridging the semantic gap from high-level pathologies to low-level tissue anomalies, and providing a new SSL method for medical imaging.
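The local-consistency idea can be illustrated with a minimal sketch. It is not the ACE implementation; it assumes that both crops are aligned to a shared grid, that each crop yields one feature vector per grid cell, and that matched patches are pulled together with a plain squared-distance penalty. The function names `grid_crop`, `correspondence_matrix`, and `local_consistency_loss` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def grid_crop(top, left, size):
    """List the grid cells (row, col) covered by a grid-aligned square crop, row-major."""
    return [(top + r, left + c) for r in range(size) for c in range(size)]


def correspondence_matrix(cells_a, cells_b):
    """Build C where C[i, j] = 1 if patch i of crop A and patch j of crop B
    cover the same grid cell of the original image, else 0."""
    index_b = {cell: j for j, cell in enumerate(cells_b)}
    C = np.zeros((len(cells_a), len(cells_b)), dtype=np.float32)
    for i, cell in enumerate(cells_a):
        j = index_b.get(cell)
        if j is not None:
            C[i, j] = 1.0
    return C


def local_consistency_loss(feat_a, feat_b, C):
    """Mean squared distance between features of matched patches.
    (Assumed loss form; the paper's actual objective may differ.)"""
    matched = C.sum()
    if matched == 0:
        return 0.0
    # Pairwise squared distances between all patches of A and B: shape (Na, Nb).
    diff = feat_a[:, None, :] - feat_b[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    return float((C * d2).sum() / matched)


# Two overlapping 3x3-cell crops taken from a 4x4 grid.
cells_a = grid_crop(0, 0, 3)  # rows 0-2, cols 0-2
cells_b = grid_crop(1, 1, 3)  # rows 1-3, cols 1-3
C = correspondence_matrix(cells_a, cells_b)
```

Because the crops are grid-aligned, the overlap is an exact set of shared cells, so the matrix `C` is binary and can be computed from crop coordinates alone, without any learned matching.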