🤖 AI Summary
This work addresses the domain gap between synthetic and real clinical head X-ray images, which arises from discrepancies in attenuation characteristics, noise patterns, and soft-tissue representation, by proposing an unpaired, high-fidelity domain translation method. Methodologically, it combines Flow Matching with Schrödinger Bridges to learn a class-conditional, domain-agnostic shared latent space, enabling translation between any pair of domains seen during training. At inference time, the model can additionally be tuned to trade off perceptual quality against structural fidelity. Evaluated on the newly introduced X-DigiSkull dataset, MedShift is smaller than diffusion-based alternatives yet delivers strong translation fidelity, anatomical consistency, and cross-domain generalization, offering a scalable, principled approach to domain adaptation in medical imaging.
📝 Abstract
Synthetic medical data offers a scalable way to train robust models, but significant domain gaps limit its generalizability to real-world clinical settings. This paper addresses cross-domain translation between synthetic and real X-ray images of the head, focusing on bridging discrepancies in attenuation behavior, noise characteristics, and soft-tissue representation. We propose MedShift, a unified class-conditional generative model based on Flow Matching and Schrödinger Bridges that enables high-fidelity, unpaired image translation across multiple domains. Unlike prior approaches that require domain-specific training or rely on paired data, MedShift learns a shared, domain-agnostic latent space and supports seamless translation between any pair of domains seen during training. To benchmark domain translation models, we introduce X-DigiSkull, a new dataset comprising aligned synthetic and real skull X-rays acquired under varying radiation doses. Experimental results demonstrate that, despite its smaller model size compared to diffusion-based approaches, MedShift offers strong performance and remains flexible at inference time: it can be tuned to prioritize either perceptual fidelity or structural consistency, making it a scalable and generalizable solution for domain adaptation in medical imaging. The code and dataset are available at https://caetas.github.io/medshift.html.
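The abstract's core ingredient, Flow Matching, can be sketched in a few lines. The snippet below is a toy illustration of the standard conditional flow-matching objective on a straight-line path between two domains, not the authors' actual MedShift code: the `toy_model`, array shapes, and one-hot condition vector are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(model, x0, x1, cond, t):
    """Flow-matching loss on the linear path from x0 (source) to x1 (target).

    The network is trained to regress the path's velocity x1 - x0 at a
    random intermediate point xt, optionally given a class/domain condition.
    """
    t = t[:, None]
    xt = (1.0 - t) * x0 + t * x1   # point on the straight interpolation path
    v_target = x1 - x0             # constant velocity of that path
    v_pred = model(xt, t, cond)    # network's predicted velocity field
    return float(np.mean((v_pred - v_target) ** 2))

# Toy "model": ignores its inputs and predicts a constant velocity.
# A real model would be a conditional neural network.
def toy_model(xt, t, cond):
    return np.ones_like(xt)

x0 = rng.normal(size=(8, 4))           # batch from the synthetic domain
x1 = rng.normal(size=(8, 4)) + 1.0     # batch from the real domain (shifted)
cond = np.zeros((8, 2))                # placeholder one-hot domain labels
t = rng.uniform(size=8)                # random times in [0, 1]

loss = flow_matching_loss(toy_model, x0, x1, cond, t)
print(loss)
```

Minimizing this loss over many (x0, x1) pairs yields a velocity field whose ODE transports samples from one domain to the other; MedShift's Schrödinger Bridge formulation and unpaired setting go beyond this simplified paired sketch.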