🤖 AI Summary
This work addresses two problems: developing and deploying AI applications (xApps/rApps) in O-RAN is slow and largely manual, and large language models (LLMs) cannot meet the fast, deterministic inference requirements of real-time RAN control. To bridge this gap, the authors propose a proof-of-concept Dual-Brain architecture that pairs an LLM's natural language understanding and code generation capabilities with NeuralSmith, a lightweight AutoML engine that trains classifiers on demand via an API. The result is an end-to-end pipeline from natural language operator intent to automated AI service deployment. The LLM acts as the orchestrator, translating intents into data-collection policies and deployment code, while NeuralSmith handles model training. The authors report practical insights from running this workflow, covering data collection, model training, and deployment, on a containerized O-RAN 5G standalone (SA) testbed. The approach aims to make AI service provisioning faster and less manual while keeping deployment safe.
📝 Abstract
The Open Radio Access Network (O-RAN) architecture allows AI to be embedded directly into the RAN through modular xApps and rApps, yet creating these applications (collecting data, training models, writing code, and deploying them safely) remains slow and largely manual. Large Language Models (LLMs) offer strong reasoning and code-generation capabilities but are unsuited for the fast, deterministic inference required in real-time RAN control. We present a proof-of-concept Dual-Brain architecture that combines both strengths: an LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while an automated ML engine, NeuralSmith, trains lightweight classifiers on demand via an API. We describe the architecture and provisioning workflow, share practical insights from a containerized O-RAN 5G SA testbed, and discuss open research directions.