Federated Prompt-Tuning with Heterogeneous and Incomplete Multimodal Client Data

📅 2026-02-06
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses semantic misalignment in federated learning that arises when client data are multimodal, heterogeneous, and missing features at the input level. The paper proposes a federated multimodal prompt-tuning framework that jointly optimizes and fuses prompt instructions across clients and modalities. The approach pairs client-specific prompt tuning with a server-side semantic-aware aggregation mechanism, so that prompts can be aligned and aggregated under heterogeneous missing-data patterns. On multiple multimodal benchmark datasets, the method is reported to outperform state-of-the-art baselines.

📝 Abstract
This paper introduces a generalized federated prompt-tuning framework for practical scenarios where local datasets are multi-modal and exhibit different distributional patterns of missing features at the input level. The proposed framework bridges the gap between federated learning and multi-modal prompt-tuning, which have traditionally focused on either uni-modal or centralized data. A key challenge in this setting arises from the lack of semantic alignment between prompt instructions that encode similar distributional patterns of missing data across different clients. To address this, our framework introduces specialized client-tuning and server-aggregation designs that simultaneously optimize, align, and aggregate prompt-tuning instructions across clients and data modalities. This allows prompt instructions to complement one another and be combined effectively. Extensive evaluations on diverse multi-modal benchmark datasets demonstrate that our work consistently outperforms state-of-the-art (SOTA) baselines.
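To make the server-aggregation step concrete, the following is a minimal sketch of a naive baseline, not the paper's method: a sample-weighted average of prompt vectors, grouped by missing-data pattern. The function name `aggregate_prompts`, the pattern keys, and the toy vectors are all hypothetical. The sketch also assumes that clients already share pattern keys, which is exactly the semantic alignment the abstract says cannot be taken for granted.

```python
from collections import defaultdict


def aggregate_prompts(client_updates):
    """Sample-weighted average of prompt vectors per missing-data pattern.

    client_updates: list of (num_samples, {pattern: prompt_vector}) pairs.
    Pattern keys (e.g. "text_missing") are hypothetical and assumed to be
    shared across clients; this is a plain FedAvg-style stand-in.
    """
    sums = {}
    weights = defaultdict(float)
    for num_samples, prompts in client_updates:
        for pattern, vector in prompts.items():
            if pattern not in sums:
                sums[pattern] = [0.0] * len(vector)
            for i, value in enumerate(vector):
                sums[pattern][i] += num_samples * value
            weights[pattern] += num_samples
    # Normalize each accumulated vector by the total sample count
    # of the clients that contributed to that pattern.
    return {p: [s / weights[p] for s in vec] for p, vec in sums.items()}


# Toy example: two clients with different missing-data patterns.
client_updates = [
    (10, {"text_missing": [1.0, 0.0], "complete": [0.0, 2.0]}),
    (30, {"text_missing": [3.0, 4.0]}),
]
global_prompts = aggregate_prompts(client_updates)
```

Keyed averaging like this only works when a given key means the same thing on every client. When prompts that encode similar missing-data patterns are not aligned across clients, an explicit alignment step is needed before aggregation, which is the gap the proposed framework targets.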
Problem

Research questions and friction points this paper is trying to address.

federated learning
prompt-tuning
multimodal data
missing features
heterogeneous data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Prompt-Tuning
Heterogeneous Multimodal Data
Incomplete Modalities
Prompt Alignment
Cross-Client Aggregation
Authors
Thu Hang Phung
Institute of AI Innovation and Societal Impact, Hanoi University of Science and Technology
Duong M. Nguyen
PhD Student, CS @ University of Illinois at Urbana-Champaign
Machine Learning, Computer Vision, Federated Learning, Generative Model
Thanh Trung Huynh
EPFL
Quoc Viet Hung Nguyen
Griffith University
Trong Nghia Hoang
Assistant Professor, Washington State University
Machine Learning, Federated Learning, Meta Learning, Model Fusion, Gaussian Processes
Phi Le Nguyen
Institute of AI Innovation and Societal Impact, Hanoi University of Science and Technology