🤖 AI Summary
To address the growing demand for edge AI in 6G networks, this work tackles a limitation of conventional RAN architectures, which use AI only to assist network optimization, by adding native support for distributed AI workloads.
Method: (1) We extend the O-RAN SMO framework with a lightweight AI-RAN orchestrator that jointly orchestrates communication and AI resources; (2) we design distributed AI-RAN sites whose scheduling accounts for workload timing requirements and geographic targeting; (3) building on modular, cloud-native Open RAN, we enable co-deployment of real-time and batch AI tasks while preserving multi-vendor interoperability.
Contribution/Results: The proposed architecture provides native AI compute support on top of existing RAN infrastructure, without requiring new hardware, so operators can reuse existing investments. It turns the RAN from a connectivity pipeline into a platform for edge intelligence and gives telecom operators a path to monetize AI at the edge.
📝 Abstract
The proliferation of data-intensive Artificial Intelligence (AI) applications at the network edge demands a fundamental shift in RAN design, from merely consuming AI for network optimization to actively enabling distributed AI workloads. This paradigm shift presents a significant opportunity for network operators to monetize AI at the edge while leveraging existing infrastructure investments. To realize this vision, this article presents a novel converged O-RAN and AI-RAN architecture that unifies orchestration and management of both telecommunications and AI workloads on shared infrastructure. The proposed architecture extends the Open RAN principles of modularity, disaggregation, and cloud-nativeness to support heterogeneous AI deployments. We introduce two key architectural innovations: (i) the AI-RAN Orchestrator, which extends the O-RAN Service Management and Orchestration (SMO) to enable integrated resource allocation across RAN and AI workloads; and (ii) AI-RAN sites that provide distributed edge AI platforms with real-time processing capabilities. The proposed system supports flexible deployment options, allowing AI workloads to be orchestrated with specific timing requirements (real-time or batch processing) and geographic targeting. The architecture thus addresses the orchestration requirements for managing heterogeneous workloads at different time scales while maintaining open, standardized interfaces and multi-vendor interoperability.
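To make the idea of orchestrating AI workloads by timing requirement and geographic target more concrete, the following is a minimal sketch of a placement decision. It is not the paper's orchestrator: the names `AIRANSite`, `AIWorkload`, and `place`, the GPU-count capacity model, the per-site latency figure, and the selection rules (lowest latency for real-time jobs, most free capacity for batch jobs) are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AIRANSite:
    """Hypothetical AI-RAN site; all fields are illustrative assumptions."""
    name: str
    region: str        # coarse geographic label
    latency_ms: float  # assumed latency from users to this site
    free_gpus: int     # assumed spare accelerator capacity


@dataclass
class AIWorkload:
    """Hypothetical AI workload request."""
    name: str
    timing: str                          # "real-time" or "batch"
    gpus: int                            # accelerators requested
    region: Optional[str] = None         # geographic target, if any
    max_latency_ms: Optional[float] = None


def place(workload: AIWorkload, sites: List[AIRANSite]) -> Optional[AIRANSite]:
    """Pick a site for the workload, or return None if nothing fits."""
    # Keep sites with enough capacity that match the geographic target.
    candidates = [
        s for s in sites
        if s.free_gpus >= workload.gpus
        and (workload.region is None or s.region == workload.region)
    ]
    if workload.timing == "real-time":
        # Real-time jobs must also respect their latency budget.
        candidates = [
            s for s in candidates
            if workload.max_latency_ms is None
            or s.latency_ms <= workload.max_latency_ms
        ]
        key = lambda s: s.latency_ms   # prefer the closest site
    else:
        key = lambda s: -s.free_gpus   # batch: prefer the emptiest site
    if not candidates:
        return None
    chosen = min(candidates, key=key)
    chosen.free_gpus -= workload.gpus  # reserve the capacity
    return chosen


if __name__ == "__main__":
    sites = [
        AIRANSite("site-a", "north", latency_ms=2.0, free_gpus=2),
        AIRANSite("site-b", "north", latency_ms=8.0, free_gpus=8),
        AIRANSite("site-c", "south", latency_ms=1.0, free_gpus=4),
    ]
    rt = AIWorkload("video-analytics", "real-time", gpus=1,
                    region="north", max_latency_ms=5.0)
    batch = AIWorkload("fine-tune", "batch", gpus=4)
    for job in (rt, batch):
        site = place(job, sites)
        print(job.name, "->", site.name if site else "unplaced")
```

In this sketch a real-time job is confined to sites within its latency budget and region, while a batch job is free to land wherever spare capacity is largest. A production orchestrator would also have to account for the RAN's own resource demands on the shared infrastructure, which this example leaves out.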