Small Lesions-aware Bidirectional Multimodal Multiscale Fusion Network for Lung Disease Classification

📅 2025-08-06
🤖 AI Summary
This work addresses two challenges in pulmonary disease diagnosis: (1) small lesions are easily missed or misdiagnosed, and (2) 3D PET-CT volumes and electronic health records (EHR) differ in dimensionality and are hard to align semantically. To tackle these, the authors propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). MMCAF-Net builds an image feature pyramid for multiscale representation, uses an efficient 3D multiscale convolutional attention module to extract discriminative features from subtle lesions, and applies a multiscale cross-attention module to align and fuse imaging features with EHR features at each scale. Experiments on the Lung-PET-CT-Dx dataset report a 2.3% improvement in classification accuracy over state-of-the-art methods, supporting the model's effectiveness for small-lesion recognition and cross-modal fusion.

📝 Abstract
The diagnosis of medical diseases faces challenges such as the misdiagnosis of small lesions. Deep learning, particularly multimodal approaches, has shown great potential in the field of medical disease diagnosis. However, the differences in dimensionality between medical imaging and electronic health record data present challenges for effective alignment and fusion. To address these issues, we propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). This model employs a feature pyramid structure combined with an efficient 3D multi-scale convolutional attention module to extract lesion-specific features from 3D medical images. To further enhance multimodal data integration, MMCAF-Net incorporates a multi-scale cross-attention module, which resolves dimensional inconsistencies, enabling more effective feature fusion. We evaluated MMCAF-Net on the Lung-PET-CT-Dx dataset, and the results showed a significant improvement in diagnostic accuracy, surpassing current state-of-the-art methods. The code is available at https://github.com/yjx1234/MMCAF-Net
Problem

Research questions and friction points this paper is trying to address.

Addresses misdiagnosis of small lung lesions in medical imaging
Resolves dimensional inconsistencies in multimodal medical data fusion
Improves accuracy of lung disease classification using deep learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D multi-scale convolutional attention module
Multi-scale cross-attention fusion
Feature pyramid structure integration
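
The multi-scale cross-attention idea listed above can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation: the function names (`cross_attend`, `fuse_multiscale`), the single-head formulation, the random projection matrices standing in for learned weights, and all sizes are assumptions chosen for illustration. The EHR vector acts as the query and attends over image tokens at each pyramid scale; projecting both modalities into a shared width `d_model` is one way to handle the dimensionality mismatch the abstract describes.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (rows x cols) by vector x (length cols)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Random projection used here in place of learned weights (assumption)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def cross_attend(query, tokens, Wq, Wk, Wv):
    """Single-head cross-attention: one query vector attends over image tokens."""
    q = matvec(Wq, query)
    keys = [matvec(Wk, t) for t in tokens]
    values = [matvec(Wv, t) for t in tokens]
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
    return out, weights


def fuse_multiscale(ehr, pyramid, d_model=8, seed=0):
    """Attend from the EHR vector to every pyramid scale and concatenate results.

    pyramid: list of scales; each scale is a list of tokens (channel vectors,
    one per voxel position), and channel width may differ between scales.
    """
    rng = random.Random(seed)
    fused, all_weights = [], []
    for tokens in pyramid:
        channels = len(tokens[0])
        Wq = random_matrix(d_model, len(ehr), rng)   # EHR   -> shared width
        Wk = random_matrix(d_model, channels, rng)   # image -> shared width
        Wv = random_matrix(d_model, channels, rng)
        out, weights = cross_attend(ehr, tokens, Wq, Wk, Wv)
        fused.extend(out)
        all_weights.append(weights)
    return fused, all_weights


# Toy inputs: a 5-feature EHR vector and a 3-level pyramid whose levels have
# 64, 8, and 1 tokens with 4, 8, and 16 channels respectively (made-up sizes).
data_rng = random.Random(42)
ehr = [data_rng.random() for _ in range(5)]
pyramid = [
    [[data_rng.random() for _ in range(c)] for _ in range(n)]
    for n, c in [(64, 4), (8, 8), (1, 16)]
]
fused, weights = fuse_multiscale(ehr, pyramid)
```

Each scale contributes a `d_model`-wide vector regardless of how many voxels or channels it has, so the fused representation has a fixed size that a classifier head could consume. A bidirectional variant would additionally let image tokens query the EHR features; that direction is omitted here for brevity.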
Authors

Jianxun Yu, Xidian University
Ruiquan Ge, Hangzhou Dianzi University (Artificial intelligence; Bioinformatics; Health information; Image processing; AI for life)
Zhipeng Wang, Hangzhou Dianzi University
Cheng Yang, Hangzhou Dianzi University
Chenyu Lin, Hangzhou Dianzi University
Xianjun Fu, Zhejiang College of Security Technology, School of Artificial Intelligence
Jikui Liu, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (Biomedical Big Data; Early Diagnosis of Cardiovascular Disease; Medical Image processing and Patten)
Ahmed Elazab, PhD, Biomedical engineering (Medical Image Analysis; Computer-aided Detection and Diagnosis; Machine & Deep Learning; others)
Changmiao Wang, Shenzhen Research Institute of Big Data