AI Summary
Clinical hip fracture risk prediction models often suffer from limited generalizability across cohorts due to distributional shifts arising from differences in geography, population demographics, and measurement protocols. To address this challenge, this study proposes an unsupervised domain adaptation fusion strategy that requires no target-domain labels, leveraging multimodal clinical and DXA data from three large cohorts: SOF, MrOS, and UK Biobank. The approach integrates Maximum Mean Discrepancy (MMD), Correlation Alignment (CORAL), and Domain-Adversarial Neural Networks (DANN) to align feature distributions across domains. Evaluated under realistic deployment conditions, the method significantly enhances model transfer stability and efficiency, overcoming the reliance on supervised fine-tuning. It achieves AUCs of 0.88 and 0.95 when transferring from male and female source cohorts, respectively, substantially outperforming non-adapted baselines and demonstrating the efficacy of multi-method fusion in improving cross-cohort generalizability.
Abstract
Clinical risk prediction models often fail to generalize across cohorts because the underlying data distributions differ by clinical site, region, demographics, and measurement protocol. This limitation is particularly pronounced in hip fracture risk prediction, where the performance of a model trained on one cohort (the source cohort) can degrade substantially when it is deployed in other cohorts (target cohorts). Using a shared set of clinical and DXA-derived features across three large cohorts, namely the Study of Osteoporotic Fractures (SOF), the Osteoporotic Fractures in Men Study (MrOS), and the UK Biobank (UKB), we systematically evaluated three domain adaptation methods and their combinations: Maximum Mean Discrepancy (MMD), Correlation Alignment (CORAL), and Domain-Adversarial Neural Networks (DANN). For both a male-only and a female-only source cohort, domain adaptation methods consistently outperformed the no-adaptation baseline (source-only training), and combinations of multiple domain adaptation methods delivered the largest and most stable gains. The method that combines MMD, CORAL, and DANN achieved the highest discrimination, with an area under the curve (AUC) of 0.88 for the male-only source cohort and 0.95 for the female-only source cohort, indicating that integrating multiple domain adaptation methods can produce feature representations that are less sensitive to dataset differences. Unlike existing methods that rely heavily on supervised tuning or assume known outcomes for samples in target cohorts, our outcome-free approaches enable model selection under realistic deployment conditions and improve the generalization of hip fracture risk prediction models.
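To make the alignment idea concrete, the following is a minimal NumPy sketch of two of the three components named above, MMD and CORAL, computed on source and target feature matrices without any outcome labels. It is not the authors' implementation: the RBF kernel, the bandwidth default `gamma = 1 / n_features`, the weights `w_mmd` and `w_coral`, and the helper name `fused_alignment_loss` are all assumptions made for illustration. The adversarial DANN term is omitted because it requires a trained domain classifier with gradient reversal.

```python
import numpy as np


def coral_loss(source, target):
    """CORAL distance: squared Frobenius norm of the difference between
    the source and target feature covariances, scaled by 1 / (4 d^2)."""
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)  # rows are samples, columns are features
    cov_t = np.cov(target, rowvar=False)
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(source, target, gamma=None):
    """Biased estimate of the squared MMD between two samples (RBF kernel)."""
    if gamma is None:
        gamma = 1.0 / source.shape[1]  # assumed default bandwidth
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return float(k_ss + k_tt - 2.0 * k_st)


def fused_alignment_loss(source, target, w_mmd=1.0, w_coral=1.0):
    """Hypothetical fusion: weighted sum of the MMD and CORAL terms.
    Uses features only, so no target-cohort outcomes are needed."""
    return w_mmd * mmd2(source, target) + w_coral * coral_loss(source, target)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(0.0, 1.0, size=(200, 5))      # stand-in for source features
    tgt_same = rng.normal(0.0, 1.0, size=(200, 5))  # same distribution
    tgt_shift = rng.normal(1.0, 2.0, size=(200, 5))  # shifted mean and scale
    print("same distribution:", fused_alignment_loss(src, tgt_same))
    print("shifted cohort:   ", fused_alignment_loss(src, tgt_shift))
```

In a training loop, a term of this kind would typically be added to the source-cohort classification loss so that the learned representation is penalized for differing between cohorts. Because both quantities depend only on features, they can also be monitored on the target cohort when outcomes are unavailable.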