🤖 AI Summary
To address weak cross-task generalization and low communication efficiency in one-shot federated learning, this paper proposes a global prompt optimization framework driven by non-interfering attention masking. Methodologically: (1) an attention isolation mechanism is introduced to suppress excessive interaction between learnable prompts and the original text embeddings; (2) a cross-silo collaborative refinement module is designed to align and aggregate decentralized multi-source visual knowledge; (3) only lightweight prompt parameters are fine-tuned atop frozen pretrained models. Crucially, the framework achieves strong generalization without iterative inter-client communication. Evaluated on ten benchmark datasets across both class-level and domain-level generalization tasks, it consistently outperforms eight state-of-the-art methods, delivering significant improvements in performance and practicality for one-shot federated prompt learning.
📝 Abstract
Federated Prompt Learning (FPL) enables communication-efficient adaptation by tuning lightweight prompts on top of frozen pre-trained models. Existing FPL methods typically rely on global information, which becomes available only after the second training round, to facilitate collaboration among client models; they are therefore inherently dependent on multi-round communication to fully exhibit their strengths. Moreover, existing one-shot federated learning methods typically focus on fitting seen tasks but lack cross-task generalization. To bridge this gap, we propose the Global Prompt Refinement with Non-Interfering Attention Masking (GPR-NIAM) method for one-shot FPL. The core idea is a masking mechanism that restricts excessive interaction between the original text embeddings and the learnable prompt embeddings. GPR-NIAM achieves this through the collaboration of two key modules. First, the attention isolation module suppresses attention from the learnable prompt tokens to the original text tokens and reweights the reverse attention, which preserves generalization across tasks. Second, the cross-silo collaborative refinement module integrates decentralized visual knowledge into a unified base and calibrates the global prompt through multi-source cross-modal knowledge alignment, further mitigating the inconsistency caused by data heterogeneity. Extensive experiments on ten benchmark datasets under two tasks show that GPR-NIAM outperforms eight state-of-the-art methods in both class-level and domain-level generalization.
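To make the attention isolation idea concrete, the sketch below builds a directional attention mask over a concatenated sequence of learnable prompt tokens followed by original text tokens: prompt→text attention is zeroed out, while the reverse (text→prompt) attention is down-weighted rather than blocked. This is a minimal illustration of the masking pattern described in the abstract, not the paper's implementation; the function name `build_isolation_mask` and the `reverse_weight` parameter are assumptions introduced here for illustration.

```python
import numpy as np

def build_isolation_mask(n_prompt, n_text, reverse_weight=0.5):
    """Multiplicative attention mask for [prompt tokens | text tokens].

    mask[i, j] scales the attention weight of query token i on key token j.
    Hypothetical sketch of the masking pattern, not the paper's exact code.
    """
    n = n_prompt + n_text
    mask = np.ones((n, n))
    # Suppress attention FROM learnable prompt tokens TO original text tokens.
    mask[:n_prompt, n_prompt:] = 0.0
    # Reweight the reverse direction: text tokens attending to prompt tokens.
    mask[n_prompt:, :n_prompt] = reverse_weight
    return mask

def masked_attention(scores, mask):
    """Apply the mask to raw attention scores, then row-normalize."""
    weighted = np.exp(scores - scores.max(axis=-1, keepdims=True)) * mask
    return weighted / weighted.sum(axis=-1, keepdims=True)
```

With, say, 2 prompt tokens and 3 text tokens, `build_isolation_mask(2, 3)` yields a 5x5 mask whose top-right 2x3 block is all zeros (isolation) and whose bottom-left 3x2 block is 0.5 (reweighted reverse attention); the remaining entries are 1, leaving text-to-text and prompt-to-prompt attention untouched.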