Multimodal Contrastive Pretraining of CBCT and IOS for Enhanced Tooth Segmentation

📅 2025-09-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing dental segmentation methods lack rigorous validation on both cone-beam computed tomography (CBCT) and intraoral scanning (IOS) data, suffering from low accuracy and poor clinical applicability that hinder high-fidelity digital dentistry modeling. To address this, we propose ToothMCL, a novel multimodal contrastive pretraining framework that achieves, for the first time, cross-modal representation alignment between CBCT (voxel-based) and IOS (mesh/point-cloud) modalities. Leveraging self-supervised contrastive learning, ToothMCL explicitly captures anatomical consistency to construct modality-invariant features. The framework supports fine-grained multi-class tooth segmentation and FDI notation identification. In internal and multi-center external evaluations, ToothMCL improves Dice scores by 12% on CBCT and 8% on IOS over state-of-the-art methods. It further demonstrates strong robustness across diverse imaging devices, scan qualities, and pathological conditions, validating its clinical generalizability and practical utility in real-world digital dentistry workflows.

📝 Abstract
Digital dentistry represents a transformative shift in modern dental practice. The foundational step in this transformation is the accurate digital representation of the patient's dentition, which is obtained from segmented Cone-Beam Computed Tomography (CBCT) and Intraoral Scans (IOS). Despite the growing interest in digital dental technologies, existing segmentation methodologies frequently lack rigorous validation and demonstrate limited performance and clinical applicability. To the best of our knowledge, this is the first work to introduce a multimodal pretraining framework for tooth segmentation. We present ToothMCL, a Tooth Multimodal Contrastive Learning for pretraining that integrates volumetric (CBCT) and surface-based (IOS) modalities. By capturing modality-invariant representations through multimodal contrastive learning, our approach effectively models fine-grained anatomical features, enabling precise multi-class segmentation and accurate identification of Fédération Dentaire Internationale (FDI) tooth numbering. Along with the framework, we curated CBCT-IOS3.8K, the largest paired CBCT and IOS dataset to date, comprising 3,867 patients. We then evaluated ToothMCL on a comprehensive collection of independent datasets, representing the largest and most diverse evaluation to date. Our method achieves state-of-the-art performance in both internal and external testing, with an increase of 12% for CBCT segmentation and 8% for IOS segmentation in the Dice Similarity Coefficient (DSC). Furthermore, ToothMCL consistently surpasses existing approaches in tooth groups and demonstrates robust generalizability across varying imaging conditions and clinical scenarios.
Problem

Research questions and friction points this paper is trying to address.

Accurate tooth segmentation from CBCT and IOS scans
Limited performance of existing dental segmentation methods
Need for precise multi-class tooth identification and numbering
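The FDI numbering the paper targets is the standard two-digit scheme (ISO 3950): the first digit gives the quadrant (1 = upper right through 4 = lower right for permanent teeth) and the second the tooth position from the midline. As a quick illustration of the label space, here is a minimal helper for permanent-dentition codes; the function name and output wording are illustrative, not from the paper:

```python
def fdi_label(code: int) -> str:
    """Map a permanent-dentition FDI two-digit code (11-48) to a description."""
    quadrants = {1: "upper right", 2: "upper left",
                 3: "lower left", 4: "lower right"}
    teeth = {1: "central incisor", 2: "lateral incisor", 3: "canine",
             4: "first premolar", 5: "second premolar",
             6: "first molar", 7: "second molar", 8: "third molar"}
    quadrant, tooth = divmod(code, 10)  # e.g. 36 -> quadrant 3, tooth 6
    if quadrant not in quadrants or tooth not in teeth:
        raise ValueError(f"invalid permanent-dentition FDI code: {code}")
    return f"{quadrants[quadrant]} {teeth[tooth]}"
```

For example, `fdi_label(36)` returns `"lower left first molar"`; a multi-class segmentation head for this task predicts one of these 32 codes (plus background) per voxel or mesh cell.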
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal contrastive learning framework
Integrates volumetric and surface-based modalities
Largest paired CBCT-IOS dataset curation
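The core idea is to pull CBCT and IOS embeddings of the same patient together while pushing apart mismatched pairs. The paper does not spell out its loss here, but a standard symmetric InfoNCE objective (as used in CLIP-style cross-modal pretraining) is a reasonable sketch of the mechanism; the function name, batch shapes, and temperature below are assumptions for illustration, not ToothMCL's exact formulation:

```python
import numpy as np

def info_nce(cbct_emb: np.ndarray, ios_emb: np.ndarray,
             temperature: float = 0.07) -> float:
    """Symmetric InfoNCE over a batch of paired embeddings.

    cbct_emb, ios_emb: (N, D) arrays where row i of each array comes from
    the same patient (the positive pair); all other rows are negatives.
    """
    # L2-normalize so the dot product is cosine similarity
    a = cbct_emb / np.linalg.norm(cbct_emb, axis=1, keepdims=True)
    b = ios_emb / np.linalg.norm(ios_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # (N, N) similarity matrix
    n = logits.shape[0]

    def xent_diag(l):
        # cross-entropy with the diagonal (matched pairs) as targets
        l = l - l.max(axis=1, keepdims=True)            # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_prob[np.arange(n), np.arange(n)].mean()

    # average over CBCT->IOS and IOS->CBCT retrieval directions
    return 0.5 * (xent_diag(logits) + xent_diag(logits.T))
```

With correctly paired batches the loss is small; shuffling one modality breaks the pairing and drives it up, which is the signal that shapes modality-invariant features during pretraining.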
Moo Hyun Son
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China.
Juyoung Bae
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China.
Zelin Qiu
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China.
Jiale Peng
Division of Paediatric Dentistry and Orthodontics, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China.
Kai Xin Li
Delun Dental Hospital, Guangzhou, China.
Yifan Lin
Hao Chen
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China.; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China.; Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China.; HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Futian, Shenzhen, China.; State Key Laboratory of Nervous System Disorders, The Hong Kong Uni