🤖 AI Summary
To address the high communication overhead of uploading parameters in federated multilingual neural machine translation (NMT), this paper proposes MetaSend, a meta-learning-driven adaptive parameter selection method. Methodologically, MetaSend introduces client-wise tensor deviation modeling into federated NMT communication compression (the first such integration, per the authors) and learns a dynamic transmission threshold that adapts across rounds without degrading model quality, avoiding the limitations of fixed pruning or quantization schemes. It combines meta-learning, parameter sparsification, and federation-aware design to improve convergence efficiency and generalization under tight communication budgets. Experiments on multilingual benchmarks show that MetaSend substantially outperforms baselines, including FedAvg and FedProx, in BLEU score, while remaining robust to non-IID language distributions across clients.
📝 Abstract
Federated learning (FL) is a promising approach for solving multilingual tasks, potentially enabling clients with their own language-specific data to collaboratively construct a high-quality neural machine translation (NMT) model. However, communication constraints in practical network systems present challenges for exchanging large-scale NMT engines between FL parties. In this paper, we propose a meta-learning-based adaptive parameter selection methodology, MetaSend, that improves the communication efficiency of model transmissions from clients during FL-based multilingual NMT training. Our approach learns a dynamic threshold for filtering parameters prior to transmission without compromising the NMT model quality, based on the tensor deviations of clients between different FL rounds. Through experiments on two NMT datasets with different language distributions, we demonstrate that MetaSend obtains substantial improvements over baselines in translation quality in the presence of a limited communication budget.
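The core mechanism the abstract describes, filtering each client's parameters by how much they deviated since the previous FL round, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `select_params_to_send`, the use of mean absolute deviation, and the fixed scalar `threshold` are all assumptions (in MetaSend the threshold is meta-learned and adapts dynamically).

```python
import numpy as np

def select_params_to_send(current, previous, threshold):
    """Keep only tensors whose deviation from the previous round
    exceeds the threshold; everything else is skipped to save bandwidth.

    current / previous: dicts mapping parameter names to np.ndarray.
    threshold: scalar cutoff (hypothetical stand-in for MetaSend's
    meta-learned, per-round adaptive threshold).
    """
    update = {}
    for name, tensor in current.items():
        # Mean absolute deviation as a simple proxy for "how much
        # this tensor changed" between FL rounds.
        deviation = np.abs(tensor - previous[name]).mean()
        if deviation > threshold:
            update[name] = tensor  # deemed worth the communication cost
    return update

# Toy example: two "layers"; only one changed appreciably this round.
prev = {"enc.w": np.zeros((2, 2)), "dec.w": np.zeros((2, 2))}
curr = {"enc.w": np.full((2, 2), 0.5), "dec.w": np.full((2, 2), 1e-4)}
sent = select_params_to_send(curr, prev, threshold=0.01)
```

Here only `enc.w` crosses the cutoff, so the client transmits roughly half the model; the paper's contribution is learning that cutoff so the filtering does not hurt translation quality.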