An Arbitrary-Modal Fusion Network for Volumetric Cranial Nerves Tract Segmentation

📅 2025-05-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In clinical practice, acquiring complete multimodal neuroimaging data (e.g., T1-weighted plus diffusion MRI) is often hindered by equipment limitations, privacy constraints, or acquisition conditions, which severely limits 3D segmentation of cranial nerve tracts. To address this, the authors propose CNTSeg-v2, a unified multimodal segmentation framework. It adopts T1-weighted (T1w) imaging as the primary modality and introduces an Arbitrary-Modal Collaboration Module (ACM) for adaptive feature selection and fusion across whichever modalities are available. A Deep Distance-guided Multi-stage (DDM) decoder is also incorporated, leveraging Signed Distance Maps (SDMs) to model fine-grained structural continuity. Built on a 3D U-Net backbone, CNTSeg-v2 employs a T1w-dominated multimodal collaborative learning strategy. Evaluated on the Human Connectome Project (HCP) dataset and the clinical Multi-shell Diffusion MRI (MDM) dataset, it achieves state-of-the-art performance, significantly improving segmentation accuracy and trajectory completeness, particularly for small cranial nerve bundles.
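The T1w-dominated fusion idea described above can be sketched in a few lines: features from whichever auxiliary modalities are present are gated by their agreement with the primary T1w features before being merged. The function name `t1w_guided_fusion`, the cosine-similarity gate, and the feature shapes are illustrative assumptions for this sketch, not the paper's ACM implementation.

```python
import numpy as np

def t1w_guided_fusion(t1w_feat, aux_feats, available):
    """Fuse features from the auxiliary modalities that are actually present,
    weighting each by a scalar cosine-similarity gate against the primary
    T1w features. (Illustrative sketch; not the paper's ACM.)"""
    fused = t1w_feat.copy()
    for feat, present in zip(aux_feats, available):
        if not present:
            continue  # this modality was not acquired; skip it entirely
        # cosine similarity between T1w and auxiliary feature blocks, clipped at 0
        num = float((t1w_feat * feat).sum())
        den = float(np.linalg.norm(t1w_feat) * np.linalg.norm(feat)) + 1e-8
        gate = max(num / den, 0.0)
        fused = fused + gate * feat
    return fused

rng = np.random.default_rng(0)
t1w = rng.standard_normal((16, 4, 4, 4))                      # C x D x H x W feature block
aux = [rng.standard_normal((16, 4, 4, 4)) for _ in range(2)]  # e.g., two diffusion-derived maps
out = t1w_guided_fusion(t1w, aux, available=[True, False])    # one modality missing
```

Because a missing modality simply contributes nothing, a single trained model can accept any combination of inputs, which is the point of the arbitrary-modal design.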

📝 Abstract
The segmentation of cranial nerve (CN) tracts provides a valuable quantitative tool for analyzing the morphology and trajectory of individual CNs. Multimodal CN tract segmentation networks, e.g., CNTSeg, which combine structural Magnetic Resonance Imaging (MRI) and diffusion MRI, have achieved promising segmentation performance. However, it is laborious or even infeasible to collect complete multimodal data in clinical practice due to limitations in equipment, user privacy, and working conditions. In this work, we propose a novel arbitrary-modal fusion network for volumetric CN tract segmentation, called CNTSeg-v2, which trains one model to handle different combinations of available modalities. Instead of directly combining all the modalities, we select T1-weighted (T1w) images as the primary modality, owing to their simple acquisition and dominant contribution to the results, and use them to supervise the information selection of the other auxiliary modalities. Our model encompasses an Arbitrary-Modal Collaboration Module (ACM) designed to effectively extract informative features from the auxiliary modalities under the supervision of T1w images. Meanwhile, we construct a Deep Distance-guided Multi-stage (DDM) decoder that corrects small errors and discontinuities through signed distance maps to improve segmentation accuracy. We evaluate CNTSeg-v2 on the Human Connectome Project (HCP) dataset and the clinical Multi-shell Diffusion MRI (MDM) dataset. Extensive experimental results show that CNTSeg-v2 achieves state-of-the-art segmentation performance, outperforming all competing methods.
Problem

Research questions and friction points this paper is trying to address.

Segments cranial nerve tracts from arbitrary combinations of input modalities
Improves accuracy with T1w-guided auxiliary modality fusion
Addresses incomplete multimodal data challenges in clinical settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Arbitrary-modal fusion network for CNs segmentation
T1w images supervise auxiliary modality selection
Deep Distance-guided Multi-stage decoder improves accuracy
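The signed distance maps used by the DDM decoder can be derived from a binary tract mask with standard Euclidean distance transforms. A minimal NumPy/SciPy sketch follows; the negative-inside/positive-outside sign convention is an assumption (the listing does not specify the paper's convention).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask):
    """Signed distance map of a binary volume: negative inside the tract,
    positive outside. (Sign convention is an assumption of this sketch.)"""
    mask = mask.astype(bool)
    outside = distance_transform_edt(~mask)  # distance to the tract, for background voxels
    inside = distance_transform_edt(mask)    # distance to the background, for tract voxels
    return outside - inside

# toy "tract": a small box inside an 8x8x8 volume
vol = np.zeros((8, 8, 8), dtype=np.uint8)
vol[3:5, 3:5, 2:6] = 1
sdm = signed_distance_map(vol)
```

Supervising on such maps penalizes predictions by their distance from the true boundary rather than by per-voxel overlap alone, which is why SDMs help with small errors and discontinuities along thin structures.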
Lei Xie
College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
Huajun Zhou
The Hong Kong University of Science and Technology
Computer Vision, Medical Image Processing
Junxiong Huang
College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
Jiahao Huang
College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
Qingrun Zeng
Zhejiang University of Technology
Jianzhong He
College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
Jiawei Zhang
College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
Baohua Fan
Department of Neurosurgery, Taihe Hospital of Wannan Medical College, Taihe, China
Mingchu Li
Guoqiang Xie
Department of Neurosurgery, Nuclear Industry 215 Hospital of Shaanxi Province, Xianyang, China
Hao Chen
Department of Computer Science and Engineering, Department of Chemical and Biological Engineering and Center for Aging Science, Hong Kong University of Science and Technology, Hong Kong, China
Yuanjing Feng
Zhejiang University of Technology
Medical Image Analysis