🤖 AI Summary
To address the challenge of balancing quality and cost when deploying LLM-based agents at the network edge, this paper proposes NetGPT, a cloud-edge collaborative optimization framework. Methodologically, it introduces (1) a network-aware routing policy with a provably unique fallback threshold that depends monotonically on bandwidth and round-trip time (RTT); and (2) a schema-preserving reinforcement learning method that combines an SFT-anchored composite objective with a reverse-KL trust-region step and a forward-KL realignment toward the SFT prior, jointly improving the routing policy and the edge-side model. Experimental results demonstrate that NetGPT substantially reduces cloud offloading while maintaining high task success rates and schema-correct outputs. Crucially, it enables smooth, controllable quality-cost trade-offs under dynamic network conditions, consistently outperforming fixed-threshold baselines in both efficiency and reliability.
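The routing idea above can be sketched as follows. This is a minimal illustration, not the paper's actual scoring policy: the functional form of the threshold and the constants `base`, `alpha`, and `beta` are hypothetical, chosen only so that the threshold falls as bandwidth grows and rises with RTT, matching the monotone dependence the paper proves.

```python
import math

def fallback_threshold(bandwidth_mbps: float, rtt_ms: float,
                       base: float = 0.5, alpha: float = 0.05,
                       beta: float = 0.002) -> float:
    """Illustrative fallback threshold, monotone in network state.

    Higher bandwidth and lower RTT make cloud offloading cheaper, so the
    quality bar an edge agent must clear drops; degraded networks raise it.
    The log1p/linear shape here is an assumption for the sketch.
    """
    return base - alpha * math.log1p(bandwidth_mbps) + beta * rtt_ms

def route(edge_score: float, bandwidth_mbps: float, rtt_ms: float) -> str:
    """Keep a request on the edge when its predicted quality score clears
    the network-dependent threshold; otherwise fall back to the cloud."""
    tau = fallback_threshold(bandwidth_mbps, rtt_ms)
    return "edge" if edge_score >= tau else "cloud"
```

Under this sketch, the same request with the same edge score can be served locally on a fast link but offloaded when RTT spikes, which is exactly the dynamic-threshold behavior the summary describes.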
📝 Abstract
Large language model (LLM) agents at the network edge offer low-latency execution for routine queries, whereas complex requests often require the superior capability of cloud models, incurring higher latency and cost. To navigate this quality-cost trade-off under dynamic network conditions, we propose NetGPT, a cloud-edge synergy framework that integrates network-aware routing with on-edge self-improvement. Specifically, our framework routes structured tool-calling requests to cloud or edge agents via a novel scoring policy. We prove that, under mild regularity assumptions, the optimal routing rule admits a unique fallback threshold with monotone dependence on bandwidth and round-trip time (RTT). Concurrently, using a dataset of requests routed to the cloud and the corresponding cloud responses, we instantiate a schema-preserving reinforcement learning (RL) procedure to improve the capability of the edge agent. We analyze a supervised fine-tuning (SFT)-anchored composite objective that combines a reverse-KL trust-region step with a forward-KL realignment toward the SFT prior, which explains training stability and constrains policy drift. The network-aware routing policy and the edge agent are updated coherently. Experiments across controlled network states and pricing schedules demonstrate smooth quality-cost frontiers, consistent gains of dynamic fallback thresholds over fixed policies, and sustained reductions in offloading while maintaining task success and schema-correct outputs.
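The SFT-anchored composite objective described in the abstract can be written schematically as follows. The symbols are assumptions for this sketch: $\pi_\theta$ is the edge policy being updated, $\pi_{\text{old}}$ the pre-update policy anchoring the trust region, $\pi_{\text{SFT}}$ the supervised fine-tuned prior, $r(x,y)$ a task reward (e.g., schema correctness and task success), and $\beta,\lambda>0$ weighting coefficients; the paper's exact formulation may differ.

$$
\mathcal{L}(\theta)
= \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}\!\big[r(x, y)\big]
\;-\; \beta \, \mathrm{KL}\!\big(\pi_\theta \,\|\, \pi_{\text{old}}\big)
\;-\; \lambda \, \mathrm{KL}\!\big(\pi_{\text{SFT}} \,\|\, \pi_\theta\big)
$$

The reverse-KL term $\mathrm{KL}(\pi_\theta \| \pi_{\text{old}})$ keeps each update inside a trust region around the previous policy (mode-seeking, limiting per-step drift), while the forward-KL term $\mathrm{KL}(\pi_{\text{SFT}} \| \pi_\theta)$ is mass-covering and pulls the policy back toward the SFT prior, which is one way to read the abstract's claim that the composite objective stabilizes training and preserves schema-conformant outputs.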