GANeXt: A Fully ConvNeXt-Enhanced Generative Adversarial Network for MRI- and CBCT-to-CT Synthesis

📅 2025-12-22

🤖 AI Summary

To address the anatomical fidelity challenge in MRI-to-CT and CBCT-to-CT cross-modal 3D synthesis for adaptive radiotherapy, this work proposes a fully ConvNeXt-driven 3D conditional GAN framework. The generator is a U-shaped network built from ConvNeXt blocks, trained with a combination of MAE, perceptual, adversarial, and segmentation-guided masked MAE losses. The discriminator is a multi-head segmentation discriminator optimized with Dice loss and cross-entropy. At inference, a sliding window with averaged overlapping patches reconstructs the full volume consistently. Key contributions include: (i) the first fully ConvNeXt-based 3D GAN architecture; (ii) a masked MAE loss that targets both structural and tissue-level fidelity; and (iii) a multi-head segmentation discriminator that enhances anatomical specificity. Models are trained jointly on all anatomical regions without fine-tuning, and the final checkpoints are taken at 3,000 epochs for MRI-to-CT and 1,000 epochs for CBCT-to-CT, with the aim of CT synthesis accurate enough for radiotherapy dose calculation.
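The summary does not spell out how the masked MAE is computed. One plausible reading, sketched below, is a mean absolute error averaged only over voxels selected by a binary segmentation mask. The function name `masked_mae`, the flattened-list inputs, and the zero fallback for an empty mask are illustrative assumptions, not the authors' implementation.

```python
def masked_mae(pred, target, mask):
    """Mean absolute error over positions where `mask` is truthy.

    Assumption: `pred`, `target`, and `mask` are equally long flat
    sequences (for example, flattened 3D volumes). This is a sketch of
    one reading of "masked MAE", not the paper's exact loss.
    """
    # Keep absolute errors only for voxels inside the mask.
    selected = [abs(p - t) for p, t, m in zip(pred, target, mask) if m]
    if not selected:
        # Assumed convention: an empty mask contributes no loss.
        return 0.0
    return sum(selected) / len(selected)
```

Restricting the average to masked voxels keeps large background regions from diluting errors inside the structures of interest, which may be the motivation for using it alongside the plain MAE.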

📝 Abstract

The synthesis of computed tomography (CT) from magnetic resonance imaging (MRI) and cone-beam CT (CBCT) plays a critical role in clinical treatment planning by enabling accurate anatomical representation in adaptive radiotherapy. In this work, we propose GANeXt, a 3D patch-based, fully ConvNeXt-powered generative adversarial network for unified CT synthesis across different modalities and anatomical regions. Specifically, GANeXt employs an efficient U-shaped generator constructed from stacked 3D ConvNeXt blocks with compact convolution kernels, while the discriminator adopts a conditional PatchGAN. To improve synthesis quality, we combine several loss functions: mean absolute error (MAE), perceptual loss, segmentation-based masked MAE, and adversarial loss, together with a combination of Dice loss and cross-entropy for the multi-head segmentation discriminator. For both tasks, training is performed with a batch size of 8 using two separate AdamW optimizers for the generator and discriminator, each equipped with a warmup and cosine decay scheduler, with learning rates of $5\times10^{-4}$ and $1\times10^{-3}$, respectively. Data preprocessing includes deformable registration, foreground cropping, percentile normalization for the input modality, and linear normalization of the CT to the range $[-1024, 1000]$. Data augmentation involves random zooming within $(0.8, 1.3)$ (for MRI-to-CT only), fixed-size cropping to $32\times160\times192$ for MRI-to-CT and $32\times128\times128$ for CBCT-to-CT, and random flipping. During inference, we apply a sliding-window approach with $0.8$ overlap and average folding to reconstruct the full-size synthetic CT (sCT), followed by inversion of the CT normalization. After joint training on all regions without any fine-tuning, the final models are selected at the end of 3000 epochs for MRI-to-CT and 1000 epochs for CBCT-to-CT using the full training dataset.
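The warmup and cosine decay schedule mentioned in the abstract can be illustrated with a small sketch. The abstract gives only the peak learning rates; the linear shape of the warmup, the decay to zero, and the parameters `warmup_steps` and `total_steps` are assumptions made here for illustration.

```python
import math


def warmup_cosine_lr(step, total_steps, peak_lr, warmup_steps):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Assumptions (not stated in the abstract): the warmup is linear, the
    decay ends at zero, and the schedule is indexed by optimizer step.
    """
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))


GEN_PEAK_LR = 5e-4   # generator learning rate from the abstract
DISC_PEAK_LR = 1e-3  # discriminator learning rate from the abstract
```

Calling `warmup_cosine_lr(step, total_steps, GEN_PEAK_LR, warmup_steps)` and the same function with `DISC_PEAK_LR` would give the two optimizers independent schedules of the same shape, matching the abstract's description of two separate AdamW optimizers.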

Problem

Research questions and friction points this paper is trying to address.

Synthesizes CT from MRI and CBCT for radiotherapy planning

Uses a 3D ConvNeXt-based GAN for cross-modality image translation

Improves synthesis with combined losses and multi-head discriminator

Innovation

Methods, ideas, or system contributions that make the work stand out.

U-shaped generator with 3D ConvNeXt blocks

Combined loss functions including masked MAE

Sliding-window inference with overlap averaging
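As a rough illustration of sliding-window inference with overlap averaging, the sketch below works along a single axis; for 3D volumes the same accumulate-and-divide step would apply along each axis. The helper names, the integer stride rule, and the `predict` callback are hypothetical and stand in for the trained generator and whatever inference utility the authors actually used.

```python
def window_starts(length, window, overlap):
    """Start indices of sliding windows covering [0, length) on one axis."""
    if length <= window:
        return [0]
    # Assumed stride rule: the non-overlapping part of the window, at least 1.
    stride = max(1, int(window * (1.0 - overlap)))
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        # Add a final window flush with the end so every position is covered.
        starts.append(length - window)
    return starts


def sliding_window_average(signal, window, overlap, predict):
    """Run `predict` on each window and average the overlapping outputs."""
    total = [0.0] * len(signal)
    count = [0] * len(signal)
    for start in window_starts(len(signal), window, overlap):
        patch_out = predict(signal[start:start + window])
        for offset, value in enumerate(patch_out):
            total[start + offset] += value
            count[start + offset] += 1
    # Each position is divided by the number of windows that covered it.
    return [t / c for t, c in zip(total, count)]
```

With the overlap of 0.8 reported in the abstract, most positions are covered by several windows, so averaging smooths the seams that would otherwise appear at patch borders.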
