🤖 AI Summary
To address the weak generalization of federated learning (FL) under non-IID and cross-domain settings, as well as the inherent trade-off between privacy preservation and communication efficiency, this paper proposes a client-adaptive focal modulation framework tailored for multimodal, resource-constrained environments. Methodologically, it introduces: (1) a task-aware client embedding mechanism that dynamically generates personalized modulation strategies; (2) a lightweight modulation-layer generation approach based on low-rank hypernetworks, drastically reducing communication overhead; and (3) a unified training paradigm integrating Transformer architectures with cross-modal adaptive optimization. Extensive experiments across eight benchmark datasets demonstrate that the method significantly outperforms state-of-the-art approaches in both source-free federated and cross-task scenarios, achieving superior generalization, high communication efficiency, and strong scalability without compromising model utility or privacy guarantees.
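To make the low-rank hypernetwork idea concrete, here is a minimal NumPy sketch. All dimensions, the factorization `W_c = B_c @ A_c`, and the hypernetwork weights `H_B`/`H_A` are illustrative assumptions, not the paper's actual architecture: a shared hypernetwork maps a task-aware client embedding to two small rank-`r` factors, so the server transmits the factors rather than a full dense modulation layer.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, k = 64, 64, 4, 8   # layer dims, rank, client-embedding dim (assumed)

# Frozen hypernetwork weights (shared on the server) -- illustrative initialization.
H_B = rng.standard_normal((d_out * r, k)) * 0.05
H_A = rng.standard_normal((r * d_in, k)) * 0.05

def modulation_weight(client_emb):
    """Generate a personalized low-rank modulation weight W_c = B_c @ A_c."""
    B = (H_B @ client_emb).reshape(d_out, r)
    A = (H_A @ client_emb).reshape(r, d_in)
    return B @ A

e = rng.standard_normal(k)          # task-aware embedding for one client
W = modulation_weight(e)
print(W.shape)                      # (64, 64)

# Communication cost: the client receives only the rank-r factors, not full W.
full_params = d_out * d_in          # 4096 values
lowrank_params = d_out * r + r * d_in  # 512 values
print(full_params, lowrank_params)
```

Under this sketch the per-client payload shrinks from `d_out * d_in` to `r * (d_out + d_in)` values, which is where the claimed communication savings would come from.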
📝 Abstract
Federated learning (FL) has proven essential for privacy-preserving, collaborative training across distributed clients. Our prior work, TransFed, introduced a robust transformer-based FL framework that leverages a learn-to-adapt hypernetwork to generate personalized focal modulation layers per client, outperforming traditional methods in non-IID and cross-domain settings. In this extended version, we propose AdaptFED, which deepens the investigation of focal modulation in generalizable FL by incorporating: (1) a refined adaptation strategy that integrates task-aware client embeddings to further personalize modulation dynamics, (2) enhanced theoretical bounds on adaptation performance, and (3) broader empirical validation across additional modalities, including time-series and multilingual data. We also introduce an efficient variant of TransFed that reduces server-client communication overhead via low-rank hypernetwork conditioning, enabling scalable deployment in resource-constrained environments. Extensive experiments on eight diverse datasets reaffirm the superiority of our method over state-of-the-art baselines, particularly in source-free and cross-task federated setups. Our findings not only extend the capabilities of focal modulation in FL but also pave the way for more adaptive, scalable, and generalizable transformer-based federated systems. The code is available at http://github.com/Tajamul21/TransFed.
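For readers unfamiliar with the focal modulation operation the abstract builds on, the following is a minimal 1D NumPy sketch of its general shape (in the spirit of FocalNets): each token's query is element-wise modulated by a gated aggregation of hierarchical context. The token/channel counts, window sizes, and random weights are assumptions for illustration only, not TransFed's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, levels = 16, 32, 3            # tokens, channels, focal levels (assumed)

X = rng.standard_normal((n, d))
Wq = rng.standard_normal((d, d)) * 0.1
gates = rng.standard_normal((n, levels + 1))   # per-token, per-level gates

def local_avg(ctx, w):
    """Aggregate each token's neighborhood within a window of radius w."""
    out = np.empty_like(ctx)
    for i in range(len(ctx)):
        lo, hi = max(0, i - w), min(len(ctx), i + w + 1)
        out[i] = ctx[lo:hi].mean(axis=0)
    return out

# Hierarchical context aggregation: growing windows, gated per token.
ctx, agg = X, np.zeros_like(X)
for l in range(levels):
    ctx = local_avg(ctx, w=2 ** l)
    agg += gates[:, [l]] * ctx
agg += gates[:, [levels]] * ctx.mean(axis=0)   # global context level

y = (X @ Wq) * agg                  # query modulated by aggregated context
print(y.shape)                      # (16, 32)
```

In the framework described above, it is the parameters of this modulation path that the hypernetwork would personalize per client.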