🤖 AI Summary
In edge video analytics (EVA), real-time, high-throughput scheduling across heterogeneous devices faces challenges including prolonged global decision cycles, poor scalability of local reinforcement learning (RL), and weak environmental adaptability. To address these, we propose the Federated Continual Policy Optimization (FCPO) framework, which integrates continual RL with federated RL. FCPO enables cross-device knowledge sharing and online environmental adaptation via personalized model aggregation and diversity-aware experience replay, and jointly optimizes batch size, input resolution, and multithreading policies. Evaluated on a real-world edge testbed, FCPO achieves over 5× higher system throughput, 60% lower end-to-end latency, 20% faster policy convergence, and up to 10× lower memory overhead, substantially easing the inherent trade-offs among convergence speed, generalization capability, and resource efficiency.
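To make "personalized model aggregation" concrete, here is a minimal sketch of one plausible scheme: shared layers are federated-averaged across agents, and each agent then interpolates between its own parameters and the global average instead of adopting it outright. The `mix` knob and the layer-dict layout are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def personalized_aggregate(local_params, weights, mix=0.5):
    """Blend each agent's parameters with a weighted global average.

    local_params: list of dicts mapping layer name -> np.ndarray, one per agent
    weights: per-agent aggregation weights (e.g., experience volume)
    mix: pull toward the global model (0 = fully local, 1 = fully global);
         a hypothetical knob standing in for the paper's agent-specific scheme.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    # Standard federated averaging over every shared layer.
    global_params = {
        name: sum(wi * p[name] for wi, p in zip(w, local_params))
        for name in local_params[0]
    }
    # Personalization step: each agent keeps part of its local model.
    return [
        {name: (1 - mix) * p[name] + mix * global_params[name] for name in p}
        for p in local_params
    ]
```

With `mix=1.0` this degenerates to plain FedAvg; smaller values preserve per-agent specialization, which matters when devices serve different inference models.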
📝 Abstract
The growing complexity of Edge Video Analytics (EVA) enables new kinds of intelligent applications but creates challenges for real-time inference serving systems. State-of-the-art (SOTA) scheduling systems optimize global workload distribution across heterogeneous devices but often suffer from extended scheduling cycles, leading to suboptimal processing in rapidly changing edge environments. Local Reinforcement Learning (RL) enables quick adjustments between cycles but faces scalability, knowledge-integration, and adaptability issues. Thus, we propose FCPO, which combines Continual RL (CRL) with Federated RL (FRL) to address these challenges. This integration dynamically adjusts inference batch sizes, input resolutions, and multi-threading during pre- and post-processing. CRL allows agents to learn from changing Markov Decision Processes, capturing dynamic environmental variations, while FRL improves generalization and convergence speed by integrating experiences across inference models. FCPO combines these via an agent-specific aggregation scheme and a diversity-aware experience buffer. Experiments on a real-world EVA testbed showed over 5× higher effective throughput, 60% lower latency, and 20% faster convergence with up to 10× less memory consumption compared to SOTA RL-based approaches.
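The "diversity-aware experience buffer" can be sketched as a replay buffer that, when full, evicts the most redundant stored transition (the one closest to its nearest neighbor) rather than the oldest. The Euclidean distance over state features and the eviction rule here are illustrative assumptions; the abstract does not specify FCPO's exact diversity criterion.

```python
import math
import random

class DiversityBuffer:
    """Replay buffer that favors retaining dissimilar transitions.

    A minimal sketch of diversity-aware experience replay: new items
    displace the stored item that is most similar to another stored item,
    keeping broad coverage of the state space in a small memory footprint.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []  # list of (state_features, transition) pairs

    @staticmethod
    def _dist(a, b):
        # Plain Euclidean distance over state-feature tuples.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def add(self, features, transition):
        if len(self.items) < self.capacity:
            self.items.append((features, transition))
            return
        # Evict the most redundant item: the one closest to its
        # nearest neighbor inside the buffer.
        def nn_dist(i):
            return min(self._dist(self.items[i][0], self.items[k][0])
                       for k in range(len(self.items)) if k != i)
        j = min(range(len(self.items)), key=nn_dist)
        self.items[j] = (features, transition)

    def sample(self, n):
        # Uniform sampling; diversity is enforced at insertion time.
        return [t for _, t in random.sample(self.items, n)]
```

Enforcing diversity at insertion rather than at sampling keeps the per-step cost bounded by the (small) buffer size, which fits the paper's emphasis on reduced memory overhead on edge devices.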