🤖 AI Summary
In federated learning, clients frequently encounter instance-level partial modality missingness (e.g., occluded image regions or missing text fields), which exacerbates statistical heterogeneity. Existing methods address only complete modality absence and do not handle such fine-grained missingness. This work is the first to systematically tackle this challenge, proposing FedMAC, a framework that combines cross-modal aggregation with contrastive regularization. A cross-modal feature compensation aggregation mechanism provides robust feature alignment under heterogeneous missing patterns, and latent-space contrastive regularization mitigates modality collapse. The approach integrates federated learning, multimodal representation learning, contrastive learning, and adaptive missingness modeling. Evaluated under strong statistical heterogeneity and high missingness ratios, the method achieves up to a 26% accuracy improvement over state-of-the-art baselines, improving generalization and robustness to partially missing modalities.
📝 Abstract
Federated Learning (FL) is a method for training machine learning models on distributed data sources. It preserves privacy by letting clients collaboratively learn a shared global model while keeping their data local. However, a significant challenge arises when modalities are missing from clients' datasets: certain features or modalities are unavailable or incomplete, which leads to heterogeneous data distributions. While previous studies have addressed complete-modality missing¹, they fail to handle partial-modality missing² because of severe instance-level heterogeneity among clients, where the pattern of missing data can vary significantly from one sample to another. To tackle this challenge, this study proposes a novel framework named FedMAC, designed to handle missing modalities under partial-modality missing conditions in FL. Additionally, to avoid trivial aggregation of multi-modal features, we introduce contrastive-based regularization to impose additional constraints on the latent representation space. The experimental results demonstrate the effectiveness of FedMAC across various client configurations with statistical heterogeneity, outperforming baseline methods by up to 26% in severe missing scenarios and highlighting its potential as a solution for partially missing modalities in federated systems.

¹ Complete missing is when one or more modalities are absent in server and clients' data.
² Partial missing is when only parts of one or more modalities are absent in server and clients' data.
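
The two definitions above (complete versus partial missing) can be made concrete with a small sketch. This is only an illustration of the terminology, not part of FedMAC: the data layout (a dict of modality name to feature values, with `None` marking a missing entry), the function name `missingness`, and the toy client are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical layout: modality name -> feature values, None marks a missing entry.
Sample = Dict[str, List[Optional[float]]]


def missingness(dataset: List[Sample], modality: str) -> str:
    """Classify how `modality` is missing across one client's samples."""
    values = [v for sample in dataset for v in sample[modality]]
    n_missing = sum(v is None for v in values)
    if n_missing == 0:
        return "none"
    if n_missing == len(values):
        return "complete"  # the whole modality is absent
    return "partial"  # only parts of the modality are absent


# Toy client: the image modality is partly missing, with a different
# pattern in each sample; the text modality is entirely absent.
client = [
    {"image": [0.2, None, 0.7], "text": [None, None]},
    {"image": [None, 0.5, 0.1], "text": [None, None]},
    {"image": [0.9, 0.4, 0.3], "text": [None, None]},
]

print(missingness(client, "image"))
print(missingness(client, "text"))
```

In this toy client the image modality falls under the partial case, because the missing positions differ from sample to sample, while the text modality falls under the complete case.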