🤖 AI Summary
Malicious xApps can launch data manipulation attacks on the O-RAN Shared Data Layer (SDL) using subtle Unicode-level alterations (hypoglyphs), evading, and even crashing, conventional machine learning based anomaly detectors (e.g., autoencoders).
Method: This paper proposes an LLM-based robust detection framework, using prompt engineering to enhance large language models' ability to parse and discriminate malformed Unicode inputs, enabling near-real-time anomaly identification in SDL messages.
Contribution/Results: The proposed LLM-based xApp detector achieves detection latency under 0.07 seconds and remains operationally stable under adversarial interference, processing manipulated messages without crashing. Although its initial detection accuracy requires further improvement, the empirical evaluation demonstrates the feasibility of LLMs for O-RAN security monitoring and points to a new direction for defending against data manipulation attacks in open RAN architectures.
📝 Abstract
The introduction of 5G and the Open Radio Access Network (O-RAN) architecture has enabled more flexible and intelligent network deployments. However, the increased complexity and openness of these architectures also introduce novel security challenges, such as data manipulation attacks on the semi-standardised Shared Data Layer (SDL) within the O-RAN platform through malicious xApps. In particular, malicious xApps can exploit this vulnerability by introducing subtle Unicode-level alterations (hypoglyphs) into the data consumed by traditional machine learning (ML)-based anomaly detection methods. These manipulations can bypass detection and cause failures in traditional ML-based anomaly detectors, such as autoencoders, which are unable to process hypoglyphed data without crashing. To address this challenge, we investigate the use of Large Language Models (LLMs) for anomaly detection within the O-RAN architecture. We demonstrate that LLM-based xApps maintain robust operational performance and process manipulated messages without crashing. While initial detection accuracy requires further improvement, our results highlight the robustness of LLMs to adversarial input manipulations such as hypoglyphs. Their adaptability through prompt engineering offers a promising route to improving accuracy, although this requires further research. Additionally, we show that LLMs achieve low detection latency (under 0.07 seconds), making them suitable for Near-Real-Time (Near-RT) RIC deployments.
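To make the attack surface concrete, here is a minimal Python sketch of a hypoglyph-style manipulation of an SDL-like metric string. The metric name, the look-alike character mapping, and the parsing logic are illustrative assumptions, not the paper's implementation; the sketch only shows how visually identical text can differ at the code-point level and break a rigid numeric parser, the failure mode described above for traditional ML pipelines.

```python
# Illustrative sketch (hypothetical example, not the paper's code): a
# hypoglyph-style substitution swaps ASCII characters for visually
# similar Unicode look-alikes, corrupting the data without changing
# its appearance.

def hypoglyph(text: str) -> str:
    # Hypothetical mapping: ASCII -> Cyrillic look-alikes.
    lookalikes = {"0": "\u041e", "e": "\u0435"}  # 'О' for '0', 'е' for 'e'
    return "".join(lookalikes.get(ch, ch) for ch in text)

clean = "cell_load=0.75"
poisoned = hypoglyph(clean)

print(clean == poisoned)  # False: the strings differ at the byte level
print(clean, poisoned)    # yet they render almost identically

# A rigid numeric parser (a stand-in for an autoencoder's
# preprocessing step) fails outright on the poisoned value:
try:
    float(poisoned.split("=")[1])
except ValueError:
    print("parser failed on hypoglyphed input")

def flag_suspicious(text: str) -> bool:
    # Crude detector: flag any non-ASCII code point in a field that
    # should contain only ASCII metric text.
    return any(ord(ch) > 127 for ch in text)

print(flag_suspicious(clean), flag_suspicious(poisoned))  # False True
```

The point of the sketch is the asymmetry it exposes: a parser that assumes well-formed ASCII crashes, whereas a model that consumes raw text token by token, as an LLM does, can still read the message and potentially flag it as anomalous.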