🤖 AI Summary
To address the low accuracy and heavy reliance on manual priors in reconstructing 3D anatomical structures from 2D X-ray images, this paper proposes Swin-X2S, an end-to-end 2D→3D medical image reconstruction framework that takes biplanar orthogonal X-rays as input. It combines a 2D Swin Transformer encoder with a decoder built on 3D convolution and cross-attention, and introduces a dimension-expansion module that maps 2D pixel features to 3D voxels, removing the need for handcrafted features and explicit anatomical modeling. Evaluated across nine public datasets covering femur, hip joint, spine, and ribs (54 anatomical classes), the method improves on state-of-the-art approaches by +3.2–7.8% in 3D segmentation Dice score and reduces errors in key morphometric parameters (e.g., angles, lengths) by 21–39%. These results suggest the framework is a promising option for anatomical shape reconstruction in clinical settings.
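For readers unfamiliar with the Dice score cited in the summary, the following is a minimal NumPy sketch of how a per-label Dice overlap between two 3D label volumes is commonly computed. The function name `dice_score`, the toy volumes, and the convention of returning 1.0 when a label is absent from both volumes are illustrative assumptions, not taken from the paper or its code.

```python
import numpy as np


def dice_score(pred, target, label):
    """Dice overlap for one label between two integer label volumes.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    p = pred == label
    t = target == label
    denom = p.sum() + t.sum()
    if denom == 0:
        # Label absent in both volumes: treated here as perfect agreement
        # (an assumed convention; other conventions exist).
        return 1.0
    return 2.0 * np.logical_and(p, t).sum() / denom


# Toy 4x4x4 volumes: each has a 32-voxel slab of label 1, overlapping in 16 voxels.
pred = np.zeros((4, 4, 4), dtype=int)
target = np.zeros((4, 4, 4), dtype=int)
pred[0:2, :, :] = 1
target[1:3, :, :] = 1

score = dice_score(pred, target, 1)  # 2 * 16 / (32 + 32) = 0.5
print(score)
```

A multi-class result such as the one summarized above would typically average this quantity over all labels and cases, though the exact averaging used in the paper is not stated here.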
📝 Abstract
The conversion from 2D X-ray to 3D shape holds significant potential for improving diagnostic efficiency and safety. However, existing reconstruction methods often rely on hand-crafted features, manual intervention, and prior knowledge, resulting in unstable shape errors and additional processing costs. In this paper, we introduce Swin-X2S, an end-to-end deep learning method for directly reconstructing 3D segmentation and labeling from 2D biplanar orthogonal X-ray images. Swin-X2S employs an encoder-decoder architecture: the encoder leverages a 2D Swin Transformer for X-ray information extraction, while the decoder employs 3D convolution with cross-attention to integrate structural features from the orthogonal views. A dimension-expanding module bridges the encoder and decoder, ensuring a smooth conversion from 2D pixels to 3D voxels. We evaluate the proposed method through extensive qualitative and quantitative experiments across nine publicly available datasets covering four anatomies (femur, hip, spine, and rib), with a total of 54 categories. We observe significant improvements over previous methods not only in the segmentation and labeling metrics but also in the clinically relevant parameters that are of primary concern in practical applications. This demonstrates the promise of Swin-X2S as an effective option for anatomical shape reconstruction in clinical scenarios. The code is available at: https://github.com/liukuan5625/Swin-X2S.
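To make the idea of a 2D-to-3D dimension expansion concrete, here is a rough NumPy sketch of one simple way to lift feature maps from two orthogonal views into a shared voxel grid: each 2D map is repeated along the axis its view cannot observe, and the two resulting volumes are concatenated along channels. This is an illustrative assumption only. The abstract does not describe the module's internals, and Swin-X2S fuses views with 3D convolution and cross-attention rather than the plain concatenation used here. The function name `expand_biplanar`, the axis layout, and the tensor sizes are all hypothetical.

```python
import numpy as np


def expand_biplanar(feat_ap, feat_lat):
    """Lift two orthogonal 2D feature maps into one 3D feature volume.

    feat_ap:  (C, H, W) features from the frontal view (assumed axis layout).
    feat_lat: (C, H, D) features from the lateral view (assumed axis layout).
    Returns an array of shape (2C, H, W, D).

    Hypothetical illustration; not the paper's dimension-expanding module.
    """
    c, h, w = feat_ap.shape
    c2, h2, d = feat_lat.shape
    if (c, h) != (c2, h2):
        raise ValueError("views must share channel count and height")
    # Repeat each map along the axis that its view cannot resolve.
    vol_ap = np.broadcast_to(feat_ap[:, :, :, None], (c, h, w, d))
    vol_lat = np.broadcast_to(feat_lat[:, :, None, :], (c, h, w, d))
    # Naive fusion by channel concatenation (stand-in for learned fusion).
    return np.concatenate([vol_ap, vol_lat], axis=0)


rng = np.random.default_rng(0)
feat_ap = rng.standard_normal((8, 16, 16))
feat_lat = rng.standard_normal((8, 16, 16))
volume = expand_biplanar(feat_ap, feat_lat)
print(volume.shape)
```

In this sketch every depth slice of the first half of the channels equals the frontal feature map, and every width slice of the second half equals the lateral one, so the volume carries no information beyond the two views until a learned decoder combines them.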