Deadline-Driven Hierarchical Agentic Resource Sharing for AI Services and RAN Functions in AI-RAN

📅 2026-05-08

🤖 AI Summary
This work addresses the challenge of co-scheduling real-time radio access network (RAN) functions and heterogeneous AI services on a unified edge infrastructure, where mismatched scheduling timescales and service migration disruptions hinder performance. The authors propose a hierarchical agentic framework (HAF) that pairs a large language model (LLM) agent for slow-timescale service placement with a closed-form, deadline-aware convex optimization algorithm for fast-timescale GPU/CPU resource allocation. A predictive critic further suppresses migrations whose interruption cost outweighs their expected benefit, which coordinates decisions across the two timescales. Experimental results show that the approach achieves a 90.0% service-level objective (SLO) compliance rate, 20.5% higher than the strongest baseline, and improves AI service request satisfaction from 51% to 85.3%, consistently outperforming existing methods across diverse load scenarios.
๐Ÿ“ Abstract
AI-RAN consolidates AI services and Radio Access Network (RAN) functions onto a unified, GPU-accelerated infrastructure at the network edge. However, compute sharing between real-time RAN functions and highly heterogeneous AI services requires coordination of scheduling decisions at mismatched timescales, and placement adaptation may require service migration across nodes with non-negligible interruptions. This paper proposes a hierarchical agentic framework (HAF) for compute sharing in AI-RAN that combines a large language model (LLM)-based agent for slow-timescale placement of AI services and RAN functions with a closed-form, deadline-aware convex algorithm for fast-timescale GPU/CPU allocation. The LLM agent is further equipped with a predictive critic that filters out migrations when the induced service interruption outweighs the expected service-level objective (SLO) benefit. Experimental results show that HAF reaches 90.0% overall SLO fulfillment, a 20.5% improvement over the strongest baseline, and raises AI service request fulfillment from 51% to 85.3%. Further evaluations show that HAF retains its advantage under diverse load conditions, while the critic consistently improves SLO fulfillment across multiple open-source LLM agents.
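The critic's filtering rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `MigrationProposal` type, its fields, the `margin` parameter, and the simple "gain minus cost" comparison are all assumptions made for illustration. The abstract only states that a migration is filtered out when the induced interruption outweighs the expected SLO benefit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MigrationProposal:
    """Hypothetical record of one placement change proposed by the agent."""
    service: str
    expected_slo_gain: float   # predicted extra requests meeting their SLO (assumed unit)
    interruption_cost: float   # predicted requests missed while migrating (same unit)


def critic_accepts(proposal: MigrationProposal, margin: float = 0.0) -> bool:
    """Accept a migration only if its predicted gain exceeds its predicted cost.

    `margin` is an assumed safety threshold, not a parameter from the paper.
    """
    return proposal.expected_slo_gain - proposal.interruption_cost > margin


def filter_migrations(proposals: List[MigrationProposal],
                      margin: float = 0.0) -> List[MigrationProposal]:
    """Keep only the proposals that pass the critic's check."""
    return [p for p in proposals if critic_accepts(p, margin)]


if __name__ == "__main__":
    proposals = [
        MigrationProposal("llm-inference", expected_slo_gain=10.0, interruption_cost=3.0),
        MigrationProposal("video-analytics", expected_slo_gain=2.0, interruption_cost=5.0),
    ]
    for p in filter_migrations(proposals):
        print("migrate:", p.service)
```

In this sketch the second proposal is dropped because its predicted interruption cost exceeds its predicted gain. How the gain and cost are actually predicted is left open here, since the abstract does not specify it.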
Problem

Research questions and friction points this paper is trying to address.

AI-RAN
resource sharing
deadline-driven
service migration
SLO fulfillment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Agentic Framework
Deadline-Aware Resource Allocation
AI-RAN
Predictive Critic
LLM-Based Scheduling