🤖 AI Summary
This work addresses the suboptimal return on investment in 5G networks amid rapidly growing AI compute demands by proposing the AI-RAN architecture, which improves the capital efficiency of network infrastructure. It establishes a techno-economic analysis framework for heterogeneous platforms (x86 + GPU) that uniquely integrates 5G physical-layer processing benchmarks, realistic wireless traffic models, and large language model inference requirements. The framework quantifies the costs and revenues of dynamically repurposing idle GPU resources to serve AI workloads during low-traffic periods. Experimental results across diverse deployment scenarios demonstrate that the additional revenue generated by AI-RAN can fully offset the incremental GPU costs, achieving a return on investment as high as eightfold and substantially improving both infrastructure utilization and economic viability.
📝 Abstract
The large-scale deployment of 5G networks has not delivered the expected return on investment for mobile network operators, raising concerns about the economic viability of future 6G rollouts. At the same time, surging demand for Artificial Intelligence (AI) inference and training workloads is straining global compute capacity. AI-RAN architectures, in which Radio Access Network (RAN) platforms accelerated by Graphics Processing Units (GPUs) share idle capacity with AI workloads during off-peak periods, offer a potential path to improved capital efficiency. However, the economic case for such systems remains unsubstantiated. In this paper, we present a techno-economic analysis of AI-RAN deployments by combining publicly available benchmarks of 5G Layer-1 processing on heterogeneous platforms -- from x86 servers with accelerators for channel coding to modern GPUs -- with realistic traffic models and AI service demand profiles for Large Language Model (LLM) inference. We construct a joint cost and revenue model that quantifies the surplus compute capacity available in GPU-based RAN deployments and evaluates the returns from leasing it to AI tenants. Our results show that, across a range of scenarios encompassing token depreciation, varying demand dynamics, and diverse GPU serving densities, the additional capital and operational expenditures of GPU-heavy deployments are offset by AI-on-RAN revenue, yielding a return on investment of up to 8x. These findings strengthen the long-term economic case for accelerator-based RAN architectures and future 6G deployments.
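The core of the joint cost and revenue model is a ratio of AI-on-RAN leasing revenue to the incremental cost of a GPU-heavy deployment. A minimal back-of-the-envelope sketch of that ratio is below; all inputs (prices, hours, amortization horizon) are illustrative assumptions, not figures from the paper, and the function name `ai_ran_roi` is hypothetical:

```python
# Toy ROI sketch for AI-RAN GPU leasing.
# All numeric inputs are illustrative assumptions, NOT values from the paper.

def ai_ran_roi(
    incremental_gpu_capex: float,      # extra CAPEX of GPU-heavy vs. baseline RAN ($)
    incremental_opex_per_year: float,  # extra power/cooling/ops cost ($/year)
    idle_gpu_hours_per_year: float,    # surplus GPU-hours available off-peak
    revenue_per_gpu_hour: float,       # price an AI tenant pays per GPU-hour ($)
    utilization: float,                # fraction of surplus hours actually sold
    horizon_years: int = 5,            # amortization horizon for CAPEX
) -> float:
    """Revenue-to-incremental-cost ratio over the horizon (ROI multiple)."""
    revenue = (idle_gpu_hours_per_year * revenue_per_gpu_hour
               * utilization * horizon_years)
    cost = incremental_gpu_capex + incremental_opex_per_year * horizon_years
    return revenue / cost

# Example with made-up inputs: ~16 off-peak GPU-hours/day, half resold.
roi = ai_ran_roi(
    incremental_gpu_capex=30_000,
    incremental_opex_per_year=3_000,
    idle_gpu_hours_per_year=6_000,
    revenue_per_gpu_hour=2.5,
    utilization=0.8,
)
print(f"ROI multiple: {roi:.1f}x")
```

Scenario sweeps in the paper (token-price depreciation, demand dynamics, GPU serving density) would correspond to varying `revenue_per_gpu_hour`, `utilization`, and `idle_gpu_hours_per_year` in a model of this shape.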