🤖 AI Summary
To address the low spatiotemporal fidelity of high-dimensional brain imaging analysis and the poor clinical adaptability of generic large-scale models, this paper proposes Dynamic Curriculum Learning for Spatiotemporal Encoding (DCL-SE). DCL-SE integrates data-driven spatiotemporal encoding (DaSE) with Approximate Rank Pooling (ARP) to compress three-dimensional neuroimaging volumes into information-dense two-dimensional dynamic representations, and introduces a Dynamic Group Mechanism (DGM) that guides the decoder through coarse-to-fine progressive training. Evaluated on six publicly available datasets spanning Alzheimer's disease and brain tumor classification, cerebral artery segmentation, and brain age prediction, DCL-SE consistently outperforms state-of-the-art methods, with clear gains in accuracy, robustness, and pathological interpretability. The framework establishes a compact, task-specific paradigm for clinically oriented neuroimaging modeling.
📝 Abstract
High-dimensional neuroimaging analyses for clinical diagnosis are often constrained by compromises in spatiotemporal fidelity and by the limited adaptability of large-scale, general-purpose models. To address these challenges, we introduce Dynamic Curriculum Learning for Spatiotemporal Encoding (DCL-SE), an end-to-end framework centered on data-driven spatiotemporal encoding (DaSE). We leverage Approximate Rank Pooling (ARP) to efficiently encode three-dimensional volumetric brain data into information-rich, two-dimensional dynamic representations, and then employ a dynamic curriculum learning strategy, guided by a Dynamic Group Mechanism (DGM), to progressively train the decoder, refining feature extraction from global anatomical structures to fine pathological details. Evaluated across six publicly available datasets spanning Alzheimer's disease and brain tumor classification, cerebral artery segmentation, and brain age prediction, DCL-SE consistently outperforms existing methods in accuracy, robustness, and interpretability. These findings underscore the critical importance of compact, task-specific architectures in the era of large-scale pretrained networks.
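As a rough illustration of the DaSE encoding step, the sketch below applies Approximate Rank Pooling to the slices of a 3D volume, treating the slice axis as the temporal ordering so that the whole volume collapses into a single 2D "dynamic" image. The coefficient formula follows the standard ARP formulation from the dynamic-image literature; the choice of slice axis, the min-max rescaling, and the function names are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def arp_coefficients(num_slices: int) -> np.ndarray:
    """Approximate rank pooling weights (standard ARP form):
    alpha_t = 2*(T - t + 1) - (T + 1) * (H_T - H_{t-1}),
    where H_t is the t-th harmonic number and t runs from 1 to T.
    """
    T = num_slices
    # harmonics[k] = H_k, with H_0 = 0
    harmonics = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, T + 1))))
    t = np.arange(1, T + 1)
    return 2.0 * (T - t + 1) - (T + 1) * (harmonics[T] - harmonics[t - 1])

def encode_volume(volume: np.ndarray, axis: int = 0) -> np.ndarray:
    """Collapse a 3D volume (e.g. D x H x W) into a 2D dynamic representation
    by treating the chosen axis as the 'temporal' ordering of slices."""
    volume = np.moveaxis(volume.astype(np.float64), axis, 0)
    alphas = arp_coefficients(volume.shape[0])
    dynamic = np.tensordot(alphas, volume, axes=(0, 0))  # weighted sum over slices
    # Rescale to [0, 1] so the encoding can be fed to an ordinary 2D network
    # (an assumed post-processing step, not specified by the paper).
    dynamic -= dynamic.min()
    if dynamic.max() > 0:
        dynamic /= dynamic.max()
    return dynamic

if __name__ == "__main__":
    mri = np.random.rand(96, 128, 128)   # placeholder volume, slice axis first
    dyn = encode_volume(mri, axis=0)
    print(dyn.shape)                      # (128, 128)
```

The resulting 2D map can then be consumed by the decoder trained under the coarse-to-fine curriculum described above; how the DGM schedules that progression is specific to the paper and is not reproduced here.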