🤖 AI Summary
To address the weak generalization of federated learning (FL) under non-IID and cross-domain settings, as well as the inherent trade-off between privacy preservation and communication efficiency, this paper proposes a client-adaptive focal modulation framework tailored for multimodal, resource-constrained environments. Methodologically, it introduces: (1) a task-aware client embedding mechanism that dynamically generates personalized modulation strategies; (2) a lightweight modulation-layer generation approach based on low-rank hypernetworks, drastically reducing communication overhead; and (3) a unified training paradigm integrating Transformer architectures with cross-modal adaptive optimization. Extensive experiments across eight benchmark datasets demonstrate that the method significantly outperforms state-of-the-art approaches in both source-free federated and cross-task scenarios, achieving superior generalization, high communication efficiency, and strong scalability without compromising model utility or privacy guarantees.
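To make the low-rank hypernetwork idea concrete, here is a minimal NumPy sketch. All dimensions, the factorization `W_c = B_c @ A_c`, and the hypernetwork weights `H_B`/`H_A` are illustrative assumptions, not the paper's actual architecture: a shared hypernetwork maps a task-aware client embedding to two small rank-`r` factors, so the server transmits the factors rather than a full dense modulation layer.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, k = 64, 64, 4, 8   # layer dims, rank, client-embedding dim (assumed)

# Frozen hypernetwork weights (shared on the server) -- illustrative initialization.
H_B = rng.standard_normal((d_out * r, k)) * 0.05
H_A = rng.standard_normal((r * d_in, k)) * 0.05

def modulation_weight(client_emb):
    """Generate a personalized low-rank modulation weight W_c = B_c @ A_c."""
    B = (H_B @ client_emb).reshape(d_out, r)
    A = (H_A @ client_emb).reshape(r, d_in)
    return B @ A

e = rng.standard_normal(k)          # task-aware embedding for one client
W = modulation_weight(e)
print(W.shape)                      # (64, 64)

# Communication cost: the client receives only the rank-r factors, not full W.
full_params = d_out * d_in          # 4096 values
lowrank_params = d_out * r + r * d_in  # 512 values
print(full_params, lowrank_params)
```

Under this sketch the per-client payload shrinks from `d_out * d_in` to `r * (d_out + d_in)` values, which is where the claimed communication savings would come from.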
📝 Abstract
Federated learning (FL) has proven essential for privacy-preserving, collaborative training across distributed clients. Our prior work, TransFed, introduced a robust transformer-based FL framework that leverages a learn-to-adapt hypernetwork to generate personalized focal modulation layers per client, outperforming traditional methods in non-IID and cross-domain settings. In this extended version, we propose AdaptFED, which deepens the investigation of focal modulation in generalizable FL by incorporating: (1) a refined adaptation strategy that integrates task-aware client embeddings to further personalize modulation dynamics, (2) enhanced theoretical bounds on adaptation performance, and (3) broader empirical validation across additional modalities, including time-series and multilingual data. We also introduce an efficient variant of TransFed that reduces server-client communication overhead via low-rank hypernetwork conditioning, enabling scalable deployment in resource-constrained environments. Extensive experiments on eight diverse datasets reaffirm the superiority of our method over state-of-the-art baselines, particularly in source-free and cross-task federated setups. Our findings not only extend the capabilities of focal modulation in FL but also pave the way for more adaptive, scalable, and generalizable transformer-based federated systems. The code is available at http://github.com/Tajamul21/TransFed.
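For readers unfamiliar with the focal modulation operation the abstract builds on, the following is a minimal 1D NumPy sketch of its general shape (in the spirit of FocalNets): each token's query is element-wise modulated by a gated aggregation of hierarchical context. The token/channel counts, window sizes, and random weights are assumptions for illustration only, not TransFed's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, levels = 16, 32, 3            # tokens, channels, focal levels (assumed)

X = rng.standard_normal((n, d))
Wq = rng.standard_normal((d, d)) * 0.1
gates = rng.standard_normal((n, levels + 1))   # per-token, per-level gates

def local_avg(ctx, w):
    """Aggregate each token's neighborhood within a window of radius w."""
    out = np.empty_like(ctx)
    for i in range(len(ctx)):
        lo, hi = max(0, i - w), min(len(ctx), i + w + 1)
        out[i] = ctx[lo:hi].mean(axis=0)
    return out

# Hierarchical context aggregation: growing windows, gated per token.
ctx, agg = X, np.zeros_like(X)
for l in range(levels):
    ctx = local_avg(ctx, w=2 ** l)
    agg += gates[:, [l]] * ctx
agg += gates[:, [levels]] * ctx.mean(axis=0)   # global context level

y = (X @ Wq) * agg                  # query modulated by aggregated context
print(y.shape)                      # (16, 32)
```

In the framework described above, it is the parameters of this modulation path that the hypernetwork would personalize per client.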