🤖 AI Summary
Early osteoporosis diagnosis is critical for preventing geriatric fractures, yet it is hindered by scarce labeled data and by the difficulty of fusing heterogeneous multimodal data. This paper proposes a clinically grounded, interpretable dual-path multimodal learning framework: one pathway extracts features from X-ray images using VGG19, InceptionV3, or ResNet50; the other encodes standardized clinical variables. Both pathways undergo PCA-based dimensionality reduction, followed by a K-means-guided selection of representative features and classification by a fully connected network. SHAP analysis identifies prior medical history, BMI, and height as the most discriminative clinical factors. Experiments indicate that clinical features contribute more to predictions than imaging features, and that combining the two modalities improves both accuracy and interpretability. The framework is intended as a trustworthy, interpretable AI aid for primary-care osteoporosis screening.
📝 Abstract
Osteoporosis is a common condition that increases fracture risk, especially in older adults. Early diagnosis is vital for preventing fractures, reducing treatment costs, and preserving mobility. However, healthcare providers face challenges such as limited labeled data and difficulties in processing medical images. This study presents a novel multi-modal learning framework that integrates clinical and imaging data to improve diagnostic accuracy and model interpretability. The model uses three pre-trained networks (VGG19, InceptionV3, and ResNet50) to extract deep features from X-ray images. These features are transformed using PCA to reduce dimensionality and focus on the most relevant components. A clustering-based selection process identifies the most representative components, which are then combined with preprocessed clinical data and passed through a fully connected network (FCN) for final classification. A feature importance plot highlights key variables, showing that Medical History, BMI, and Height were the main contributors, which underlines the significance of patient-specific data. While imaging features were valuable, they had lower importance, indicating that clinical data are crucial for accurate predictions. This framework promotes precise and interpretable predictions, enhancing transparency and building trust in AI-driven diagnoses for clinical integration.
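The pipeline described in the abstract (deep image features, PCA, clustering-based selection of representative components, fusion with clinical data, fully connected classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not say how the clustering step picks components, so this sketch clusters the PCA component columns with K-means and keeps the column nearest each centroid. Random arrays stand in for the CNN embeddings and clinical variables, and the helper names, dimensions, and hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def select_representative_components(components, n_clusters, seed=0):
    """Cluster PCA component columns; keep the column nearest each centroid.

    This selection rule is an assumed reading of "clustering-based selection".
    """
    columns = components.T  # one row per PCA component
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(columns)
    selected = []
    for k in range(n_clusters):
        members = np.flatnonzero(km.labels_ == k)
        if members.size == 0:  # defensive: skip a cluster with no members
            continue
        dists = np.linalg.norm(columns[members] - km.cluster_centers_[k], axis=1)
        selected.append(members[np.argmin(dists)])
    return np.sort(np.array(selected))


def build_fused_features(image_features, clinical, n_components=20,
                         n_clusters=5, seed=0):
    """Reduce image features, select representatives, append clinical data."""
    img_pcs = PCA(n_components=n_components,
                  random_state=seed).fit_transform(image_features)
    keep = select_representative_components(img_pcs, n_clusters, seed)
    clinical_scaled = StandardScaler().fit_transform(clinical)
    return np.hstack([img_pcs[:, keep], clinical_scaled]), keep


# Synthetic stand-ins (hypothetical shapes, no real patient data).
rng = np.random.default_rng(0)
n_patients = 120
image_features = rng.normal(size=(n_patients, 512))  # e.g. pooled CNN embeddings
clinical = rng.normal(size=(n_patients, 6))          # e.g. BMI, height, history
labels = rng.integers(0, 2, size=n_patients)         # synthetic binary labels

fused, keep = build_fused_features(image_features, clinical)

# Small fully connected classifier on the fused representation.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(fused, labels)
print("fused feature matrix:", fused.shape, "selected components:", keep)
```

For brevity the sketch fits PCA and the scaler on all samples; in a real evaluation these would be fitted on the training split only, so that no information leaks into the test set. Because the labels here are random, the fitted classifier has no predictive meaning and serves only to show the data flow.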