🤖 AI Summary
This work addresses the generalization–convergence trade-off in over-the-air computation (AirComp)-enabled federated meta-learning for wireless edge AI under channel distortion. We propose the first unified AirComp-based federated meta-learning framework and theoretically establish that wireless channel noise induces an implicit regularization effect: it enhances generalization to unseen users and tasks while slowing convergence. We further derive a theoretical model of this trade-off, characterizing the interplay between convergence rate and generalization performance in AirComp-based federated meta-learning. Experimental results under realistic wireless channel conditions demonstrate a 12.7% improvement in generalization accuracy at the cost of no more than an 18% increase in required communication rounds, validating both the theoretical analysis and the practical efficacy of the proposed framework.
📝 Abstract
For modern artificial intelligence (AI) applications such as large language models (LLMs), the training paradigm has recently shifted to pre-training followed by fine-tuning. Furthermore, owing to dwindling open data repositories and to efforts to democratize access to AI models, pre-training is expected to increasingly migrate from today's centralized deployments to federated learning (FL) implementations. Meta-learning provides a general framework in which pre-training and fine-tuning can be formalized. Meta-learning-based personalized FL (meta-pFL) moves beyond basic personalization by targeting generalization to new agents and tasks. This paper studies the generalization performance of meta-pFL in a wireless setting in which the agents participating in the pre-training phase, i.e., meta-learning, are connected to the server via a shared wireless channel. Adopting over-the-air computation, we study the trade-off between generalization to new agents and tasks, on the one hand, and convergence, on the other. The trade-off arises from the fact that channel impairments may enhance generalization while degrading convergence. Extensive numerical results validate the theory.
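To make the over-the-air aggregation mechanism concrete, the following is a minimal sketch of AirComp-style model averaging under channel noise. It is an illustrative assumption, not the paper's actual protocol: the function name `aircomp_aggregate`, the SNR parameter, and the idealized perfect channel inversion are all hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

def aircomp_aggregate(updates, snr_db=10.0, rng=rng):
    """Estimate the mean of K local updates sent over a shared analog channel.

    updates: (K, d) array of local model updates, one row per agent.
    snr_db:  receive signal-to-noise ratio in dB (assumed parameter).
    """
    K, d = updates.shape
    # Idealized assumption: transmit-side precoding perfectly inverts the
    # fading, so the server receives the clean superposition of all updates.
    superposition = updates.sum(axis=0)
    # Additive receiver noise scaled to the chosen SNR.
    signal_power = np.mean(superposition ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=d)
    # The channel noise perturbs the aggregate; the paper's thesis is that
    # such perturbations act as implicit regularization, improving
    # generalization while slowing convergence.
    return (superposition + noise) / K

updates = rng.normal(size=(8, 4))       # 8 agents, 4-dimensional updates
noisy_mean = aircomp_aggregate(updates, snr_db=20.0)
ideal_mean = updates.mean(axis=0)       # noiseless digital aggregation
```

At high SNR the AirComp estimate approaches the ideal federated average; lowering `snr_db` increases the perturbation, which is the knob behind the convergence–generalization trade-off studied in the paper.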