Joint Optimization of Training and Inference in Federated Edge Learning via Constrained Multi-Objective Deep Reinforcement Learning

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the joint optimization of federated training and inference on resource-constrained edge devices. It formulates the problem as a multi-objective Markov decision process (MOMDP) that maximizes inference accuracy while minimizing latency and energy consumption, with data and model freshness incorporated into the accuracy formulation. The authors propose Constrained Multi-Objective Proximal Policy Optimization (C-MOPPO), which relies on a tandem-queue-inspired conversion mechanism to bridge inference requests and training data, and jointly optimizes device mode selection, communication resource allocation, and computation resource allocation. According to the reported experiments, C-MOPPO outperforms baseline methods across a range of system configurations and produces high-quality, dense Pareto-front solutions that balance the three competing objectives.

📝 Abstract

Federated edge learning (FEEL) has recently emerged as a promising paradigm for achieving edge intelligence (EI) by enabling collaborative model training across edge devices while protecting data privacy. In this paper, we put forth an online optimization framework that jointly manages federated training and inference on resource-constrained edge devices. We introduce a tandem-queue-inspired conversion mechanism that bridges inference requests and training data, and further incorporate both data and model freshness into the accuracy formulation to capture temporal dynamics in real-world environments. To maximize inference accuracy while minimizing latency and energy consumption, the mode selection, communication resource allocation, and computation resource allocation of edge devices are jointly optimized. We formulate this as a multi-objective optimization problem, which is NP-hard and further complicated by the online setting. To address these challenges, we transform the problem into a multi-objective Markov decision process (MOMDP) and develop a constrained multi-objective proximal policy optimization (C-MOPPO) algorithm. Specifically, C-MOPPO first learns a set of policies with different preferences across the three objectives, then leverages constrained policy optimization to enrich the Pareto front and obtain high-quality, dense solutions. Extensive experiments demonstrate that C-MOPPO achieves well-balanced trade-offs among objectives and significantly outperforms baselines under various system configurations.
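The abstract refers to preferences across three objectives and to a Pareto front. The following is a minimal Python sketch of those two ideas only, not the C-MOPPO algorithm itself. It assumes each candidate policy is summarized by a hypothetical (accuracy, latency, energy) tuple, and it uses a weighted sum as one possible way to encode a preference. The function names and all numeric values are invented for illustration and do not come from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical outcome of one policy: (accuracy, latency, energy).
# Accuracy is to be maximized; latency and energy are to be minimized.
Point = Tuple[float, float, float]


def scalarize(point: Point, weights: Sequence[float]) -> float:
    """Weighted-sum score for one preference vector (an assumed encoding)."""
    acc, lat, eng = point
    w_acc, w_lat, w_eng = weights
    return w_acc * acc - w_lat * lat - w_eng * eng


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative values only (not results from the paper).
candidates: List[Point] = [
    (0.90, 120.0, 5.0),
    (0.85, 80.0, 4.0),
    (0.80, 130.0, 6.0),  # dominated by the first candidate
    (0.70, 60.0, 2.5),
]

front = pareto_front(candidates)

# One preference vector picks a single operating point from the front.
preference = (1.0, 0.002, 0.01)
best = max(front, key=lambda p: scalarize(p, preference))

print("Pareto front:", front)
print("Preferred point:", best)
```

Changing `preference` selects a different point on the same front, which is why a dense front is useful: it gives finer-grained choices between accuracy, latency, and energy.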

Problem

Research questions and friction points this paper is trying to address.

Federated Edge Learning

Multi-Objective Optimization

Resource-Constrained Edge Devices

Inference Accuracy

Latency and Energy Trade-off

Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Edge Learning

Multi-Objective Optimization

Constrained Policy Optimization