FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation

📅 2026-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of insufficient client-side personalization in multimodal federated learning, which arises from heterogeneous data modalities, task discrepancies, and inconsistent model architectures across clients. To tackle these issues, the authors propose FedAFD, a unified framework that combines a bi-level adversarial alignment strategy and a granularity-aware fusion module on the client side with a similarity-guided ensemble knowledge distillation mechanism on the server. By jointly leveraging adversarial alignment, adaptive fusion, and distillation, FedAFD bridges modality and task gaps while accommodating model heterogeneity. Experimental results show that FedAFD consistently outperforms existing methods under both IID and non-IID settings, improving both client-level personalization and global-model efficiency.
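The summary does not spell out how the adversarial alignment is implemented. One common way to realize adversarial representation alignment is a gradient-reversal layer feeding a discriminator that tries to tell local from global features; a minimal PyTorch sketch under that assumption (all class and function names here are illustrative, not taken from the paper):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) the gradient
    on the backward pass, so the encoder learns to fool the discriminator."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class AlignDiscriminator(nn.Module):
    """Classifies a representation as local (0) or global (1)."""
    def __init__(self, dim, lam=1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 2))

    def forward(self, feat):
        return self.net(GradReverse.apply(feat, self.lam))

def adversarial_align_loss(disc, local_feat, global_feat):
    """Cross-entropy on the local-vs-global game; minimizing it through
    the reversal layer pushes the two feature distributions together."""
    logits = disc(torch.cat([local_feat, global_feat], dim=0))
    labels = torch.cat([
        torch.zeros(local_feat.size(0), dtype=torch.long),  # local = 0
        torch.ones(global_feat.size(0), dtype=torch.long),  # global = 1
    ])
    return nn.CrossEntropyLoss()(logits, labels)
```

A "bi-level" version would apply such a loss both within each modality and across modalities; this sketch shows only the single alignment game.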

📝 Abstract
Multimodal Federated Learning (MFL) enables clients with heterogeneous data modalities to collaboratively train models without sharing raw data, offering a privacy-preserving framework that leverages complementary cross-modal information. However, existing methods often overlook personalized client performance and struggle with modality/task discrepancies, as well as model heterogeneity. To address these challenges, we propose FedAFD, a unified MFL framework that enhances client and server learning. On the client side, we introduce a bi-level adversarial alignment strategy to align local and global representations within and across modalities, mitigating modality and task gaps. We further design a granularity-aware fusion module to integrate global knowledge into the personalized features adaptively. On the server side, to handle model heterogeneity, we propose a similarity-guided ensemble distillation mechanism that aggregates client representations on shared public data based on feature similarity and distills the fused knowledge into the global model. Extensive experiments conducted under both IID and non-IID settings demonstrate that FedAFD achieves superior performance and efficiency for both the client and the server.
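The abstract describes the server-side mechanism only at a high level. A minimal sketch of what a similarity-guided ensemble distillation step could look like in PyTorch, assuming cosine similarity between client and global features on the public data as the weighting signal (function names and the exact weighting scheme are assumptions, not details from the paper):

```python
import torch
import torch.nn.functional as F

def similarity_guided_ensemble(client_logits, client_feats, global_feat):
    """Fuse heterogeneous clients' predictions on shared public data,
    weighting each client by how similar its features are to the
    global model's features (a hypothetical reading of the mechanism)."""
    # client_feats: list of K tensors [B, D]; global_feat: [B, D]
    sims = torch.stack([
        F.cosine_similarity(f, global_feat, dim=-1)      # [B]
        for f in client_feats
    ], dim=0)                                            # [K, B]
    weights = F.softmax(sims, dim=0)                     # normalize over clients
    logits = torch.stack(client_logits, dim=0)           # [K, B, C]
    return (weights.unsqueeze(-1) * logits).sum(dim=0)   # [B, C] fused teacher

def distill_step(global_model, public_x, teacher_logits, optimizer, T=2.0):
    """One KD step: the global model (student) matches the fused teacher."""
    student_logits = global_model(public_x)
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits.detach() / T, dim=-1),  # teacher is fixed
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only logits and features on public data cross the network, the scheme tolerates arbitrarily different client architectures, which is the point of the model-heterogeneity claim.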
Problem

Research questions and friction points this paper is trying to address.

Multimodal Federated Learning
modality discrepancy
task discrepancy
model heterogeneity
personalized performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Federated Learning
Adversarial Alignment
Granularity-aware Fusion
Similarity-guided Distillation
Model Heterogeneity
Min Tan
Professor of School of Computer Science and Technology, Hangzhou Dianzi University
Machine Learning, Image Processing, Multimedia, Computer Vision
Junchao Ma
Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University
Yinfu Feng
Alibaba International Digital Commerce Group
Jiajun Ding
Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University
Wenwen Pan
Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University
Tingting Han
Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University
Qian Zheng
College of Computer Science and Technology, Zhejiang University
Zhenzhong Kuang
Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University
Zhou Yu
Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University