🤖 AI Summary
In federated learning, non-participating clients struggle to deploy personalized models under in-domain distribution shifts and resource constraints; existing approaches rely on fine-tuning and generalize poorly. This paper proposes a zero-fine-tuning dynamic model generation framework: it is the first to introduce hypernetworks into the non-participation setting, integrating distribution-aware embeddings with a NoisyEmbed-enhanced extractor to prevent feature collapse, and it designs a balancing penalty mechanism for module-wise generation of lightweight, task-specific submodels. The method substantially reduces communication and storage overhead while outperforming state-of-the-art methods across multiple datasets and model architectures. Ablation studies and visualization analyses validate the effectiveness of each component. Overall, this work establishes a new paradigm for efficient, personalized model deployment on resource-constrained edge devices with unknown, heterogeneous data distributions.
📝 Abstract
Federated Learning (FL) has emerged as a promising paradigm for privacy-preserving collaborative learning, yet data heterogeneity remains a critical challenge. While existing methods make progress in addressing data heterogeneity for participating clients, they fail to generalize to non-participating clients with in-domain distribution shifts and resource constraints. To mitigate this issue, we present HyperFedZero, a novel method that dynamically generates specialized models via a hypernetwork conditioned on distribution-aware embeddings. Our approach explicitly incorporates distribution-aware inductive biases into the model's forward pass, extracting robust distribution embeddings using a NoisyEmbed-enhanced extractor with a Balancing Penalty that effectively prevents feature collapse. The hypernetwork then leverages these embeddings to generate specialized models chunk-by-chunk for non-participating clients, ensuring adaptability to their unique data distributions. Extensive experiments on multiple datasets and models demonstrate HyperFedZero's strong performance: it consistently surpasses competing methods with minimal computational, storage, and communication overhead. Moreover, ablation studies and visualizations validate the necessity of each component, confirming that the generated models adapt meaningfully to client distributions.
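To make the pipeline described above concrete, the following is a minimal NumPy sketch of the two stages: extracting a distribution-aware embedding from a client's local data (with additive noise standing in for the NoisyEmbed mechanism), then using a hypernetwork to emit target-model parameters chunk-by-chunk. All function names, shapes, and the per-chunk MLP heads are illustrative assumptions, not the authors' implementation; a real hypernetwork would be trained end-to-end and would typically share weights across chunks.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_embedding(client_data, noise_std=0.1, dim=8):
    """Distribution-aware embedding: summary statistics of the client's
    local data, projected and perturbed with Gaussian noise (a stand-in
    for the NoisyEmbed extractor, which guards against feature collapse)."""
    stats = np.concatenate([client_data.mean(axis=0), client_data.std(axis=0)])
    proj = rng.standard_normal((stats.size, dim)) / np.sqrt(stats.size)
    return stats @ proj + noise_std * rng.standard_normal(dim)

def hypernetwork(embedding, chunk_shapes, hidden=16):
    """Generate the specialized model's parameters chunk-by-chunk,
    conditioned on the distribution embedding."""
    params = []
    for shape in chunk_shapes:
        n_out = int(np.prod(shape))
        # Hypothetical per-chunk head; untrained random weights for illustration.
        w1 = rng.standard_normal((embedding.size, hidden)) / np.sqrt(embedding.size)
        w2 = rng.standard_normal((hidden, n_out)) / np.sqrt(hidden)
        chunk = np.tanh(embedding @ w1) @ w2
        params.append(chunk.reshape(shape))
    return params

# A non-participating client's local data (synthetic: 100 samples, 4 features).
client_data = rng.normal(loc=2.0, scale=0.5, size=(100, 4))
emb = extract_embedding(client_data)
# Generate a small two-layer submodel (weights and biases) for this client.
weights = hypernetwork(emb, chunk_shapes=[(4, 16), (16,), (16, 3), (3,)])
print([w.shape for w in weights])
```

The key property the sketch illustrates is that the client never fine-tunes: its data enters only through the embedding, and every parameter of the deployed submodel is produced by the hypernetwork in a single forward pass.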