Fake It Right: Injecting Anatomical Logic into Synthetic Supervised Pre-training for Medical Segmentation

📅 2026-03-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes an anatomy-informed synthetic supervised pre-training framework for medical segmentation. Existing methods rely on generic geometric shapes, which fail to capture the morphological complexity, spatial layout, and inter-organ relationships of real anatomy, so models pre-trained on them lack the global structural priors that medical imaging requires. The framework injects anatomical logic into synthetic data generation: a lightweight repository of realistic anatomical shapes supplies the objects, and a structure-aware sequential placement strategy, guided by spatial anchors and an organ topology graph, keeps each synthetic sample physiologically plausible. Evaluated on the Vision Transformer architecture, the method outperforms the current state-of-the-art FDSL baseline by 1.74% on BTCV and surpasses SSL approaches by up to 1.66% on MSD, and its performance continues to improve as the synthetic data volume grows.

📝 Abstract

Vision Transformers (ViTs) excel in 3D medical segmentation but require massive annotated datasets. While Self-Supervised Learning (SSL) mitigates this using unlabeled data, it still faces strict privacy and logistical barriers. Formula-Driven Supervised Learning (FDSL) offers a privacy-preserving alternative by pre-training on synthetic mathematical primitives. However, a critical semantic gap limits its efficacy: generic shapes lack the morphological fidelity, fixed spatial layouts, and inter-organ relationships of real anatomy, preventing models from learning essential global structural priors. To bridge this gap, we propose an Anatomy-Informed Synthetic Supervised Pre-training framework that unifies FDSL's infinite scalability with anatomical realism. We replace basic primitives with a lightweight shape bank of de-identified, label-only segmentation masks from 5 subjects. Furthermore, we introduce a structure-aware sequential placement strategy to govern the patch synthesis process. Instead of random placement, we enforce physiological plausibility using spatial anchors for correct localization and a topological graph to manage inter-organ interactions (e.g., preventing impossible overlaps). Extensive experiments on the BTCV and MSD datasets demonstrate that our method significantly outperforms state-of-the-art FDSL baselines and SSL methods by 1.74% and up to 1.66%, respectively, while exhibiting a robust scaling effect where performance improves with increased synthetic data volume. This provides a data-efficient, privacy-compliant solution for medical segmentation. The code will be made publicly available upon acceptance.
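The abstract does not spell out the placement algorithm, so the following is only a minimal sketch of what anchor-guided, topology-constrained sequential placement could look like. Everything in it is an assumption made for illustration: the organ names, anchor coordinates, label IDs, the `NO_OVERLAP` graph, and the function names are hypothetical, and simple ellipsoids stand in for the real shape-bank masks.

```python
import numpy as np

# Hypothetical spatial anchors: normalized (z, y, x) centers per organ.
# Values are illustrative assumptions, not taken from the paper.
ANCHORS = {
    "liver": (0.50, 0.45, 0.35),
    "spleen": (0.50, 0.45, 0.75),
    "kidney": (0.50, 0.65, 0.60),
}
LABELS = {"liver": 1, "spleen": 2, "kidney": 3}
# Hypothetical topology graph: for each organ, the organs it must not overlap.
NO_OVERLAP = {
    "liver": {"spleen", "kidney"},
    "spleen": {"liver", "kidney"},
    "kidney": {"liver", "spleen"},
}


def ellipsoid_mask(radii):
    """Stand-in for a shape-bank entry: a binary ellipsoid mask."""
    rz, ry, rx = radii
    z, y, x = np.ogrid[-rz:rz + 1, -ry:ry + 1, -rx:rx + 1]
    return (z / rz) ** 2 + (y / ry) ** 2 + (x / rx) ** 2 <= 1.0


def place_organs(shape_bank, vol_shape=(32, 64, 64), jitter=0.05, seed=0):
    """Place one mask per organ, in a fixed order, into a label volume."""
    rng = np.random.default_rng(seed)
    dims = np.array(vol_shape)
    label_vol = np.zeros(vol_shape, dtype=np.uint8)
    for organ, anchor in ANCHORS.items():  # sequential placement
        bank = shape_bank[organ]
        mask = bank[rng.integers(len(bank))]  # sample a shape from the bank
        # Jitter the anchor slightly so samples differ but stay localized.
        center = np.array(anchor) + rng.uniform(-jitter, jitter, 3)
        start = np.round(center * dims).astype(int) - np.array(mask.shape) // 2
        # Clip the paste window to the volume bounds.
        lo = np.maximum(start, 0)
        hi = np.minimum(start + np.array(mask.shape), dims)
        if np.any(hi <= lo):
            continue  # mask fell entirely outside the volume
        vol_win = tuple(slice(a, b) for a, b in zip(lo, hi))
        mask_win = tuple(slice(a - s, b - s) for a, b, s in zip(lo, hi, start))
        target = label_vol[vol_win]  # view into the label volume
        # Topology check: never overwrite voxels of a forbidden neighbor.
        forbidden = [LABELS[o] for o in NO_OVERLAP[organ]]
        free = ~np.isin(target, forbidden)
        target[mask[mask_win] & free] = LABELS[organ]
    return label_vol


# Usage: build a tiny stand-in shape bank and synthesize one label volume.
bank = {
    "liver": [ellipsoid_mask((6, 10, 10))],
    "spleen": [ellipsoid_mask((5, 7, 7))],
    "kidney": [ellipsoid_mask((4, 6, 5))],
}
volume = place_organs(bank, seed=0)
print({name: int((volume == lab).sum()) for name, lab in LABELS.items()})
```

In this sketch the overlap rule is the simplest one possible: organs placed earlier keep their voxels, and later organs only fill free space. A fuller topology graph could also encode permitted relations such as containment or adjacency, but the abstract gives no detail on that, so it is left out here.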

Problem

Research questions and friction points this paper is trying to address.

medical segmentation

synthetic data

anatomical realism

structural priors

privacy-preserving

Innovation

Methods, ideas, or system contributions that make the work stand out.

Anatomy-Informed Synthesis

Synthetic Supervised Pre-training

Structure-Aware Placement