🤖 AI Summary
Multimodal federated learning faces significant challenges from client-side modality heterogeneity and inconsistent model architectures, which hinder feature alignment, inflate communication costs, and compromise robustness. To address these issues, this work proposes CoMFed, a novel framework that introduces, for the first time, a latent-space consensus mechanism. CoMFed employs learnable projection matrices to generate compact cross-modal latent representations and incorporates a latent-space regularization term to align representations across clients. This approach preserves data privacy, substantially reduces both communication and computational overhead, and enhances robustness against outliers. Experimental results demonstrate that CoMFed achieves competitive accuracy on human activity recognition benchmarks.
📝 Abstract
Federated learning (FL) enables collaborative model training across distributed devices without sharing raw data, but applying FL to multi-modal settings introduces significant challenges. Clients typically possess heterogeneous modalities and model architectures, making it difficult to align feature spaces efficiently while preserving privacy and minimizing communication costs. To address this, we introduce CoMFed, a Communication-Efficient Multi-Modal Federated Learning framework that uses learnable projection matrices to generate compressed latent representations. A latent-space regularizer aligns these representations across clients, improving cross-modal consistency and robustness to outliers. Experiments on human activity recognition benchmarks show that CoMFed achieves competitive accuracy with minimal overhead.
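The mechanism described above can be sketched in a few lines: each client projects its own modality features into a small shared latent space via a learnable matrix, and a regularizer penalizes the distance between a client's mean latent representation and a shared consensus vector. This is a minimal illustrative sketch only; the dimensions, the consensus rule, and the penalty form are assumptions for illustration, not details taken from the CoMFed paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared latent dimension (assumption, not from the paper).
LATENT_DIM = 8

def project(features, W):
    """Compress modality features into the shared latent space via a
    learnable projection matrix W (here a fixed random stand-in)."""
    return features @ W

# Two clients with heterogeneous modalities, e.g. a 6-dim accelerometer
# feature vector vs. a 12-dim fused gyroscope/audio feature vector.
x_client_a = rng.normal(size=(32, 6))    # batch of 32 samples, 6 features
x_client_b = rng.normal(size=(32, 12))   # batch of 32 samples, 12 features

W_a = rng.normal(size=(6, LATENT_DIM)) / np.sqrt(6)
W_b = rng.normal(size=(12, LATENT_DIM)) / np.sqrt(12)

z_a = project(x_client_a, W_a)           # shape (32, LATENT_DIM)
z_b = project(x_client_b, W_b)           # shape (32, LATENT_DIM)

def alignment_penalty(z_local, z_consensus):
    """Latent-space regularizer: squared distance between a client's
    mean latent representation and a shared consensus vector."""
    return float(np.sum((z_local.mean(axis=0) - z_consensus) ** 2))

# One plausible consensus rule: average the client latent means. Only
# these compact LATENT_DIM-sized vectors would need to be communicated,
# never the raw modality data.
consensus = 0.5 * (z_a.mean(axis=0) + z_b.mean(axis=0))
loss_reg_a = alignment_penalty(z_a, consensus)
loss_reg_b = alignment_penalty(z_b, consensus)
```

The communication saving in this sketch comes from exchanging only the `LATENT_DIM`-sized summaries rather than raw features or full model weights; the actual quantities CoMFed exchanges may differ.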