EAT: QoS-Aware Edge-Collaborative AIGC Task Scheduling via Attention-Guided Diffusion Reinforcement Learning

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of high QoS requirements, resource heterogeneity, cold-start latency, and the inherent trade-off between inference delay and generation quality in AIGC tasks at the edge, this paper proposes a collaborative scheduling algorithm based on attention-guided diffusion reinforcement learning. It is the first work to incorporate diffusion models into AIGC task scheduling: an attention mechanism dynamically perceives real-time load and queue states across heterogeneous edge servers, guiding a policy network to perform fine-grained task partitioning and cross-server collaborative inference, while enabling model reuse and adaptive load balancing. Experiments demonstrate that our approach reduces inference latency by up to 56% over baseline methods, significantly improves resource utilization and system throughput, and maintains high generation fidelity. This work establishes a novel paradigm for efficient, quality-aware AIGC service delivery in heterogeneous edge computing environments.

📝 Abstract
The growth of Artificial Intelligence (AI) and large language models has enabled the use of Generative AI (GenAI) in cloud data centers for diverse AI-Generated Content (AIGC) tasks. However, models such as Stable Diffusion introduce unavoidable delays and substantial resource overhead, which are unsuitable for users at the network edge with high QoS demands. Deploying AIGC services on edge servers reduces transmission times but often leads to underutilized resources and fails to optimally balance inference latency and quality. To address these issues, this paper introduces a QoS-aware Edge-collaborative AIGC Task scheduling (EAT) algorithm. Specifically: 1) We segment AIGC tasks and schedule patches to various edge servers, formulating this as a gang scheduling problem that balances inference latency and quality while accounting for server heterogeneity, such as differing model distributions and cold-start issues. 2) We propose a reinforcement learning-based EAT algorithm that uses an attention layer to extract load and task-queue information from edge servers and employs a diffusion-based policy network for scheduling, efficiently enabling model reuse. 3) We develop an AIGC task scheduling system that uses the EAT algorithm to divide tasks and distribute them across multiple edge servers for processing. Experimental results based on our system and large-scale simulations show that EAT can reduce inference latency by up to 56% compared to baselines. Our open-source code is available at https://github.com/zzf1955/EAT.
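The attention-driven patch placement described in the abstract can be sketched as a toy loop: dot-product attention between a task embedding and per-server state embeddings yields placement scores, and a simple queue-length discount stands in for load balancing. This is an illustrative sketch only; `attention_scores`, `schedule_patches`, and the `1/(1+load)` penalty are assumptions for illustration, not the paper's learned policy.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_scores(task_feat, server_feats):
    # dot-product attention between one task embedding and each server's state embedding
    return softmax([sum(t * s for t, s in zip(task_feat, sf)) for sf in server_feats])

def schedule_patches(num_patches, server_feats, task_feat):
    # greedily place each patch on the server whose attention score,
    # discounted by its current queue length, is highest
    loads = [0.0] * len(server_feats)
    assignment = []
    for _ in range(num_patches):
        scores = attention_scores(task_feat, server_feats)
        adjusted = [s / (1.0 + l) for s, l in zip(scores, loads)]
        best = max(range(len(adjusted)), key=adjusted.__getitem__)
        assignment.append(best)
        loads[best] += 1.0
    return assignment
```

With two identical servers, the discount alone alternates placements (e.g. `schedule_patches(4, [[1.0], [1.0]], [1.0])` yields a 2/2 split), mimicking the load-balancing behavior the summary attributes to the attention mechanism.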
Problem

Research questions and friction points this paper is trying to address.

Minimizing AIGC task delays for edge users with high QoS demands
Optimizing edge server resource utilization and task scheduling efficiency
Balancing inference latency and quality in heterogeneous edge environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Edge-collaborative AIGC task scheduling algorithm
Attention-guided diffusion reinforcement learning
QoS-aware gang scheduling for edge servers
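The "diffusion reinforcement learning" innovation above follows the general diffusion-policy pattern: sample an action by starting from Gaussian noise over per-server logits and iteratively denoising toward a scheduling decision. The sketch below is a generic illustration of that pattern, not the paper's network; `denoise_step` is a hypothetical stand-in for the learned noise predictor, and the `1/(t+1)` update is an assumed toy schedule.

```python
import random

def diffusion_policy_sample(denoise_step, num_servers, steps=10, seed=0):
    # start from Gaussian noise over per-server action logits
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(num_servers)]
    # iteratively denoise: each step subtracts a fraction of the predicted noise
    for t in reversed(range(steps)):
        eps = denoise_step(x, t)  # predicted noise at diffusion step t
        x = [xi - ei / (t + 1) for xi, ei in zip(x, eps)]
    # final logits select the target server
    return max(range(num_servers), key=x.__getitem__)
```

If `denoise_step` predicts the displacement from some target logits (as a trained model would, approximately), the final `t = 0` step snaps `x` onto that target, so the argmax recovers the intended server regardless of the initial noise.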
Zhifei Xu
Faculty of Arts and Sciences, Beijing Normal University, Zhuhai 519087, China and Institute of Artificial Intelligence and Future Networks, Beijing Normal University, Zhuhai 519087, China
Zhiqing Tang
Associate Professor, Beijing Normal University
Edge Computing · Edge AI Systems · Container · Reinforcement Learning
Jiong Lou
Research Assistant Professor, Shanghai Jiao Tong University
Edge Computing · Blockchain
Zhi Yao
School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China and Institute of Artificial Intelligence and Future Networks, Beijing Normal University, Zhuhai 519087, China
Xuan Xie
Macau University of Science and Technology
Trustworthy LLM · Cyber Physical System · Neural Network Verification
Tian Wang
Institute of Artificial Intelligence and Future Networks, Beijing Normal University, Zhuhai 519087, China
Yinglong Wang
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China
Weijia Jia
IEEE Fellow, Chair Professor, Beijing Normal University and UIC
Cyber Intelligent ComputingNetworking