🤖 AI Summary
Federated brain imaging analysis across institutions commonly suffers from both client-level and instance-level modality incompleteness (e.g., missing PET, MRI, or CT scans), which undermines model robustness and generalizability. Method: We propose ClusMFL, a multimodal federated learning framework designed for this setting. It uses the FINCH clustering algorithm to build a pool of cluster centers for the feature embeddings of each modality-label pair, aligns features within each modality against those centers via supervised contrastive learning, uses the centers as proxies for missing modalities, and applies a modality-aware federated aggregation strategy. Contribution/Results: Evaluated on the ADNI dataset (structural MRI and PET), ClusMFL achieves state-of-the-art performance against various baseline methods across varying levels of modality incompleteness, offering a scalable approach to cross-institutional brain imaging analysis.
📝 Abstract
Multimodal Federated Learning (MFL) has emerged as a promising approach for collaboratively training multimodal models across distributed clients, particularly in healthcare domains. In brain imaging analysis, modality incompleteness is a significant challenge: some institutions may lack specific imaging modalities (e.g., PET, MRI, or CT) due to privacy concerns, device limitations, or data availability issues. While existing work typically assumes modality completeness or oversimplifies missing-modality scenarios, we simulate a more realistic setting by considering both client-level and instance-level modality incompleteness. Building on this simulation, we propose ClusMFL, a novel MFL framework that leverages feature clustering for cross-institutional brain imaging analysis under modality incompleteness. Specifically, ClusMFL uses the FINCH algorithm to construct a pool of cluster centers for the feature embeddings of each modality-label pair, capturing fine-grained data distributions. These cluster centers are then used for feature alignment within each modality through supervised contrastive learning, and they also act as proxies for missing modalities, allowing cross-modal knowledge transfer. In addition, ClusMFL employs a modality-aware aggregation strategy, which improves model performance in scenarios with severe modality incompleteness. We evaluate the proposed framework on the ADNI dataset, using structural MRI and PET scans. Extensive experimental results demonstrate that ClusMFL achieves state-of-the-art performance compared to various baseline methods across varying levels of modality incompleteness, providing a scalable solution for cross-institutional brain imaging analysis.
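To make the idea of a cluster-center pool per modality-label pair more concrete, here is a minimal Python sketch. It is not the ClusMFL implementation. It assumes that only the first partition of FINCH is used (each embedding is linked to its nearest neighbor and the connected groups become clusters), that cosine similarity is the metric, and that embeddings are plain lists of floats. The names `first_neighbor_partition` and `build_center_pool` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm > 0 else 0.0


def first_neighbor_partition(vectors):
    """Group vectors by linking each one to its nearest neighbor.

    Simplified stand-in for FINCH's first partition (assumption):
    the connected components of the first-neighbor graph are the clusters.
    """
    n = len(vectors)
    parent = list(range(n))  # union-find forest over vector indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    if n > 1:
        for i in range(n):
            nearest = max(
                (j for j in range(n) if j != i),
                key=lambda j: cosine(vectors[i], vectors[j]),
            )
            root_i, root_j = find(i), find(nearest)
            if root_i != root_j:
                parent[root_i] = root_j

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(vectors[i])
    return list(groups.values())


def mean(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_center_pool(samples):
    """Map each (modality, label) pair to a list of cluster centers.

    `samples` is an iterable of (modality, label, embedding) triples.
    """
    by_pair = defaultdict(list)
    for modality, label, embedding in samples:
        by_pair[(modality, label)].append(embedding)
    return {
        pair: [mean(group) for group in first_neighbor_partition(vectors)]
        for pair, vectors in by_pair.items()
    }
```

In a federated setting, each client would compute such a pool from its local embeddings and share only the centers rather than raw scans. How the centers are then used for contrastive alignment and as missing-modality proxies is specific to the paper and is not reproduced here.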