Conditional Imputation for Within-Modality Missingness in Multi-Modal Federated Learning

๐Ÿ“… 2026-04-24
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF

career value

204K/year
๐Ÿค– AI Summary
This work addresses the challenge of intra-modal temporal data missingness in multimodal federated learning, which commonly arises from intermittent sensor availability or irregular sampling and disrupts input semantic structure, thereby degrading model robustness. To tackle this issue, the authors propose CondI, a novel framework that introduces conditional diffusion models into federated learning to explicitly model the distribution of missing data. CondI employs a two-stage training strategy: first performing context-aware conditional imputation leveraging multimodal information, followed by joint optimization of modality-specific feature extractors and a shared embedding space. Experimental results demonstrate that CondI achieves state-of-the-art performance across three clinical datasetsโ€”PTB-XL, SLEEP-EDF, and MIMIC-IVโ€”and significantly enhances model stability under severe missingness scenarios.

Technology Category

Application Category

๐Ÿ“ Abstract
Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative training, but real-world clinical applications often suffer from within-modality missingness caused by sensor intermittency or irregular sampling. Existing methods implicitly represent unobserved data via architectural alignment or missing embeddings, often failing to recover the true distribution and yielding sub-optimal performance. We propose CondI, a federated framework explicitly addressing this missingness using conditional diffusion models. CondI employs a two-phase training pipeline: first, imputing unobserved temporal components using available multimodal context and conditional embeddings; second, optimizing modality-specific extractors and joint embedding spaces. During inference, imputed raw data pass through trained extractors to generate robust features, providing a holistic representation for downstream tasks. Explicit data imputation ensures models operate on complete semantic structures, significantly enhancing resilience against severe data incompleteness. Experiments on three clinical datasets (PTB-XL, SLEEP-EDF, MIMIC-IV) demonstrate CondI achieves comparable results to state-of-the-art baselines. Code: https://github.com/ZhengWugeng/CondI
Problem

Research questions and friction points this paper is trying to address.

within-modality missingness
multimodal federated learning
data imputation
conditional diffusion models
clinical data incompleteness
Innovation

Methods, ideas, or system contributions that make the work stand out.

conditional diffusion models
within-modality missingness
multimodal federated learning
explicit imputation
two-phase training
๐Ÿ”Ž Similar Papers