🤖 AI Summary
Standard masked self-supervised learning for dental CBCT analysis relies on random masking, which does not target diagnostically critical regions of high textural complexity and so limits how well the model learns salient anatomical and pathological features. To address this, the work proposes ATMask, an adaptive masking strategy guided by an inter-slice texture variation map, which steers Masked Image Modeling (MIM) to preferentially mask high-complexity regions through a lightweight texture-aware mechanism and thereby strengthens 3D contextual representations. The study also introduces the first large-scale pretraining dataset for dental CBCT, comprising 6,314 scans, and states that the dataset and code will be released. Experiments on three downstream tasks show that ATMask outperforms random masking and other self-supervised baselines in data efficiency and representation quality.
📝 Abstract
Cone Beam Computed Tomography (CBCT) is pivotal for 3D diagnostic imaging in dentistry. However, the development of robust AI models for volumetric analysis is often constrained by the scarcity of large, annotated datasets. Self-supervised learning (SSL), particularly Masked Image Modeling (MIM), offers a promising pathway to leverage unlabeled data. A limitation of standard MIM is its reliance on random masking, which fails to prioritize diagnostically critical regions in dental CBCT volumes, such as subtle pathological changes and intricate anatomical boundaries. To address this, we propose ATMask, a novel adaptive masking strategy. Instead of applying random masks or employing computationally intensive attention modules, ATMask computes an inter-slice texture variation map to identify regions with high structural or textural complexity. These high-variation areas are then selectively masked during pretraining, compelling the model to learn the richer contextual representations needed to infer complex 3D morphological transitions. Furthermore, we contribute the first large-scale CBCT dataset for pretraining dental AI models, curated from both public and private sources and comprising 6,314 scans. Extensive experiments on three downstream dental CBCT tasks demonstrate that ATMask enables more data-efficient and powerful representation learning than standard random masking and other advanced SSL baselines. The dataset and code will be released.
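The abstract does not give the formula behind the inter-slice texture variation map or the rule used to select patches, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that variation is the mean absolute intensity difference between adjacent slices, pooled over non-overlapping 3D patches, and that patches are sampled for masking with probability proportional to that score. The function names, patch size, mask ratio, and `eps` are all hypothetical.

```python
import numpy as np


def texture_variation_map(volume, patch=(4, 16, 16)):
    """Per-patch inter-slice variation (assumed definition, not from the paper)."""
    vol = volume.astype(np.float32)
    # Absolute intensity change between consecutive slices along the depth axis.
    diff = np.abs(np.diff(vol, axis=0))
    # Repeat the last difference so the map keeps the volume's depth.
    diff = np.concatenate([diff, diff[-1:]], axis=0)
    d, h, w = diff.shape
    pd, ph, pw = patch
    if d % pd or h % ph or w % pw:
        raise ValueError("volume shape must be divisible by the patch size")
    # Average the differences inside each non-overlapping 3D patch.
    blocks = diff.reshape(d // pd, pd, h // ph, ph, w // pw, pw)
    return blocks.mean(axis=(1, 3, 5))


def adaptive_mask(var_map, mask_ratio=0.75, eps=1e-6, rng=None):
    """Sample patches to mask, favouring high-variation patches (assumed rule)."""
    rng = np.random.default_rng() if rng is None else rng
    scores = var_map.ravel().astype(np.float64) + eps  # eps keeps every patch selectable
    probs = scores / scores.sum()
    n_mask = int(round(mask_ratio * scores.size))
    # Weighted sampling without replacement over patch indices.
    chosen = rng.choice(scores.size, size=n_mask, replace=False, p=probs)
    mask = np.zeros(scores.size, dtype=bool)
    mask[chosen] = True
    return mask.reshape(var_map.shape)  # True marks a masked patch


# Toy volume: flat on the left half, noisy (high inter-slice variation) on the right.
rng = np.random.default_rng(0)
volume = np.zeros((16, 64, 64), dtype=np.float32)
volume[:, :, 32:] = rng.random((16, 64, 32))
var_map = texture_variation_map(volume)
mask = adaptive_mask(var_map, mask_ratio=0.5, rng=rng)
```

In this toy volume the masked patches concentrate in the noisy half, which is the behaviour the abstract describes for high-variation regions. Sampling rather than taking a hard top-k is a design choice made here so that low-variation patches can still occasionally be masked; the paper may handle this differently.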