🤖 AI Summary
This work addresses two key challenges in pulmonary disease diagnosis: (1) a high risk of missing small lesions, and (2) severe dimensionality mismatch and semantic misalignment between 3D PET-CT volumes and electronic health records (EHR). To tackle these, the authors propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). MMCAF-Net constructs an image feature pyramid for multiscale representation, incorporates a 3D multiscale convolutional attention module to enhance discriminative feature learning from subtle lesions, and introduces a multiscale cross-attention mechanism that aligns and fuses imaging and EHR features at each level of the pyramid. Experiments on the Lung-PET-CT-Dx dataset show that MMCAF-Net achieves a +2.3% improvement in classification accuracy over state-of-the-art methods, supporting its effectiveness in detecting small lesions and in cross-modal integration.
📝 Abstract
Disease diagnosis from medical images faces challenges such as the misdiagnosis of small lesions. Deep learning, particularly multimodal approaches, has shown great potential for this task. However, the difference in dimensionality between medical imaging data and electronic health record (EHR) data makes effective alignment and fusion difficult. To address these issues, we propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). The model combines a feature pyramid structure with an efficient 3D multi-scale convolutional attention module to extract lesion-specific features from 3D medical images. To further improve multimodal integration, MMCAF-Net incorporates a multi-scale cross-attention module that resolves the dimensional inconsistencies between the two modalities and enables more effective feature fusion. We evaluated MMCAF-Net on the Lung-PET-CT-Dx dataset, where it showed a significant improvement in diagnostic accuracy and surpassed current state-of-the-art methods. The code is available at https://github.com/yjx1234/MMCAF-Net
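To make the fusion idea concrete, the following is a minimal NumPy sketch of cross-attention between a 1D EHR vector and 3D image feature maps at several pyramid scales. It is an illustration under stated assumptions, not the authors' implementation: the function names (`cross_attention`, `fuse_multiscale`), the tensor shapes, the random projection weights, and the choice to use the EHR vector as the query are all hypothetical. The point it shows is how flattening each 3D feature map into tokens lets a low-dimensional record attend over a volume of any size.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query, tokens, w_q, w_k, w_v):
    """Scaled dot-product cross-attention.

    query:  (1, d_ehr)  EHR feature vector (assumed to act as the query)
    tokens: (n, c)      flattened image features, one token per voxel
    Returns the attended feature (1, d_model) and the weights (1, n).
    """
    q = query @ w_q                              # (1, d_model)
    k = tokens @ w_k                             # (n, d_model)
    v = tokens @ w_v                             # (n, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (1, n)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def fuse_multiscale(pyramid, ehr, d_model=16, seed=0):
    """Attend from the EHR vector to every pyramid level and concatenate.

    pyramid: list of arrays shaped (C, D, H, W); channels and resolution
             may differ per level.
    ehr:     array shaped (d_ehr,)
    Projection weights are random here; in a real model they are learned.
    """
    rng = np.random.default_rng(seed)
    query = ehr[None, :]
    fused = []
    for fmap in pyramid:
        c = fmap.shape[0]
        tokens = fmap.reshape(c, -1).T           # (D*H*W, C)
        w_q = rng.normal(scale=0.1, size=(ehr.shape[0], d_model))
        w_k = rng.normal(scale=0.1, size=(c, d_model))
        w_v = rng.normal(scale=0.1, size=(c, d_model))
        out, _ = cross_attention(query, tokens, w_q, w_k, w_v)
        fused.append(out[0])
    return np.concatenate(fused)                 # (num_levels * d_model,)


# Toy inputs: three pyramid levels and a 10-feature EHR vector.
rng = np.random.default_rng(42)
pyramid = [
    rng.normal(size=(8, 8, 16, 16)),
    rng.normal(size=(16, 4, 8, 8)),
    rng.normal(size=(32, 2, 4, 4)),
]
ehr = rng.normal(size=(10,))
fused = fuse_multiscale(pyramid, ehr)
print(fused.shape)  # (48,)
```

Each level contributes a fixed-size vector regardless of its spatial resolution, so the concatenated output can feed a standard classification head. The released repository should be consulted for how the actual module is defined.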