Hybrid Orchestration of Edge AI and Microservices via Graph-based Self-Imitation Learning

📅 2026-03-03
🤖 AI Summary
This work addresses the challenge of orchestrating heterogeneous request chains composed of AI services and traditional microservices in resource-constrained edge environments, where tightly coupled deployment and routing decisions render existing isolated optimization approaches ineffective for ensuring system performance. To this end, the paper introduces self-imitation learning into edge AI microservice orchestration for the first time, formulating hybrid orchestration as a sequential decision-making problem. It leverages a graph attention network to encode service topology and dependency relationships and integrates a self-imitation-enhanced proximal policy optimization (PPO) algorithm to jointly optimize deployment and routing strategies. The proposed method effectively explores high-reward trajectories in sparse-reward settings with large combinatorial action spaces, significantly reducing end-to-end latency and improving resource utilization, outperforming various heuristic, metaheuristic, and deep reinforcement learning baselines.
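To make the graph-encoding step more concrete, the following is a minimal sketch of a single-head graph attention layer applied to a small service dependency graph. It is not the paper's architecture: the node names, feature vectors, weight matrix `W`, and attention vector `a` are all hypothetical, and the layer follows the generic graph attention formulation (score each neighbor, softmax the scores, aggregate the projected neighbor features).

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU used to score each (node, neighbor) pair."""
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer (illustrative sketch).

    features:  {node: feature vector}, e.g. per-service resource demand
    neighbors: {node: list of neighbor nodes}; include the node itself
               if a self-loop is wanted
    W:         projection matrix (out_dim x in_dim), hypothetical weights
    a:         attention vector of length 2 * out_dim, hypothetical weights
    """
    # Project every node's features into the shared embedding space.
    z = {node: matvec(W, h) for node, h in features.items()}
    out = {}
    for i, nbrs in neighbors.items():
        # Unnormalised score for each neighbor j: LeakyReLU(a . [z_i || z_j]).
        scores = [
            leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the projected neighbor features.
        out[i] = [
            sum(alpha * z[j][k] for alpha, j in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ]
    return out

# Hypothetical two-service chain: a gateway microservice calling a detector AI service.
features = {"gateway": [1.0, 0.0], "detector": [0.0, 1.0]}
neighbors = {"gateway": ["gateway", "detector"], "detector": ["detector"]}
identity = [[1.0, 0.0], [0.0, 1.0]]
embeddings = gat_layer(features, neighbors, identity, a=[0.0, 0.0, 0.0, 0.0])
```

With an all-zero attention vector the softmax is uniform, so each node embedding reduces to the mean of its neighbors' projected features. In a trained model, `W` and `a` would be learned so that the more relevant dependencies receive more weight.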

📝 Abstract
Modern edge AI applications increasingly rely on microservice architectures that integrate both AI services and conventional microservices into complex request chains with stringent latency requirements. Effectively orchestrating these heterogeneous services is crucial for ensuring low-latency performance, yet remains challenging due to their diverse resource demands and strong operational interdependencies under resource-constrained edge environments. In particular, frequent interactions between services tightly couple deployment and routing decisions, yet existing approaches optimize them in isolation, leading to fundamentally inadequate system performance. In this paper, we propose SIL-GPO, a reinforcement learning framework that optimizes hybrid orchestration for edge AI microservice systems. SIL-GPO formulates the orchestration problem as a sequential decision-making task and leverages graph attention networks to encode service topologies and routing dependencies within the agent state representation. Moreover, SIL-GPO integrates a self-imitation learning strategy into proximal policy optimization, enabling the agent to prioritize and reuse high-reward trajectories. This guides policy updates towards globally promising solutions that standard RL often fails to discover under sparse rewards and large combinatorial action spaces. We conduct extensive experiments on trace-driven edge AI workloads, demonstrating that SIL-GPO significantly reduces end-to-end service latency and enhances resource utilization compared to state-of-the-art heuristic, metaheuristic, and deep RL baselines. Our framework offers a unified and scalable solution for efficient orchestration of AI services and microservices at the edge, paving the way for low-latency, high-performance edge AI deployments.
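The idea of reusing high-reward trajectories can be illustrated with the generic self-imitation learning objective, in which a stored transition contributes to the update only when its observed return exceeds the critic's current estimate. The sketch below is an assumption-laden illustration, not SIL-GPO's implementation: the `Transition` fields, the `value_coef` default, and the way this auxiliary loss would be combined with the PPO loss are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Transition:
    log_prob: float  # log pi(a|s) of the stored action under the current policy
    value: float     # critic estimate V(s) for the stored state
    ret: float       # discounted return R actually observed from this step

def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return for every step of one episode."""
    returns, running = [], 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def sil_losses(batch, value_coef=0.01):
    """Self-imitation losses over a replayed batch (illustrative sketch).

    Only transitions whose return beat the value estimate contribute,
    via the clipped advantage max(R - V(s), 0).
    """
    policy_loss = 0.0
    value_loss = 0.0
    for t in batch:
        clipped_adv = max(t.ret - t.value, 0.0)
        # Raise the log-probability of actions that did better than expected.
        policy_loss += -t.log_prob * clipped_adv
        # Pull the critic up towards returns it underestimated.
        value_loss += 0.5 * clipped_adv ** 2
    n = max(len(batch), 1)
    return policy_loss / n, value_coef * value_loss / n

# Sparse-reward episode: the only reward arrives at the final step.
episode_returns = discounted_returns([0.0, 0.0, 1.0], gamma=0.5)

batch = [
    Transition(log_prob=-1.0, value=0.5, ret=1.5),  # better than expected: imitated
    Transition(log_prob=-2.0, value=2.0, ret=1.0),  # worse than expected: ignored
]
policy_loss, value_loss = sil_losses(batch, value_coef=1.0)
```

In a full training loop these terms would typically be added to the PPO loss computed on fresh on-policy rollouts. The clipping is what keeps the agent from imitating past decisions that turned out worse than the critic predicted.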
Problem

Research questions and friction points this paper is trying to address.

Edge AI
Microservices
Service Orchestration
Latency Optimization
Resource Constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based Self-Imitation Learning
Hybrid Orchestration
Edge AI Microservices
Graph Attention Networks
Proximal Policy Optimization
Chen Yang
The Hong Kong University of Science and Technology
Transfer learning, Medical Image Analysis
Jin Zheng
Lecturer in Data Science, University of Bristol
Yang Zhuolin
School of Computer Science, South-Central Minzu University, Wuhan, China, 430074, and Key Laboratory of Cyber-Physical Fusion Intelligent Computing, State Ethnic Affairs Commission.
Lai Pan
School of Computer Science, South-Central Minzu University, Wuhan, China, 430074, and Key Laboratory of Cyber-Physical Fusion Intelligent Computing, State Ethnic Affairs Commission.
Zhang Xiao
School of Computer Science, South-Central Minzu University, Wuhan, China, 430074, and Key Laboratory of Cyber-Physical Fusion Intelligent Computing, State Ethnic Affairs Commission.
Hu Menglan
Hubei Key Laboratory of Smart Internet Technology, School of Electronic Information and Communication, Huazhong University of Science and Technology, Wuhan, China, 430074
Yin Haiyan
Centre for Frontier AI Research, Agency for Science, Technology and Research (A*STAR), Singapore