Can Large Language Models Automate the Refinement of Cellular Network Specifications?

📅 2025-07-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Manual review of 3GPP cellular network specifications is inefficient, and existing automated tools struggle to keep pace with the growing body of specifications. Method: This paper systematically applies large language models (LLMs) to the automated refinement of cellular network specifications. Leveraging over 200,000 approved 3GPP Change Requests (CRs), we construct a domain-specific dataset and introduce CR-eval, an evaluation framework for identifying specification weaknesses. We benchmark 16 state-of-the-art LLMs and explore LLM specialization techniques, including fine-tuning. Contribution/Results: Top models discover security-related weaknesses in over 127 of 200 test cases within five trials, and a fine-tuned 8B-parameter model matches or surpasses advanced LLMs such as GPT-4o and DeepSeek-R1. An evaluation on 30 cellular attacks identifies the open challenges that remain before full automation is achievable.

📝 Abstract

Cellular networks serve billions of users globally, yet concerns about reliability and security persist due to weaknesses in 3GPP standards. Traditional analysis methods, including manual inspection and automated tools, struggle to keep pace with ever-expanding cellular network specifications. This paper investigates the feasibility of using Large Language Models (LLMs) for automated cellular network specification refinement. To this end, we leverage 200,000+ approved 3GPP Change Requests (CRs), which document specification revisions, to construct a valuable dataset for domain tasks. We introduce CR-eval, a principled evaluation framework, and benchmark 16 state-of-the-art LLMs, demonstrating that top models can discover security-related weaknesses in over 127 out of 200 test cases within five trials. To bridge potential gaps, we explore LLM specialization techniques, including fine-tuning an 8B model to match or surpass advanced LLMs like GPT-4o and DeepSeek-R1. Evaluations on 30 cellular attacks identify open challenges for achieving full automation. These findings confirm that LLMs can automate the refinement of cellular network specifications and provide valuable insights to guide future research in this direction.
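The "within five trials" criterion in the abstract can be read as a success-within-k count: a test case counts as solved if at least one of the model's first k attempts identifies the weakness. The sketch below illustrates that reading only. The function names and the input layout (one list of boolean trial outcomes per test case) are assumptions made for illustration, not the paper's actual CR-eval implementation or scoring rule.

```python
from typing import Sequence


def solved_within_k(trials: Sequence[Sequence[bool]], k: int = 5) -> int:
    """Count test cases with at least one success among the first k trials.

    `trials[i]` holds the per-attempt outcomes for test case i
    (hypothetical layout, assumed for this sketch).
    """
    return sum(1 for outcomes in trials if any(outcomes[:k]))


def success_rate_at_k(trials: Sequence[Sequence[bool]], k: int = 5) -> float:
    """Fraction of test cases solved within the first k trials."""
    if not trials:
        raise ValueError("need at least one test case")
    return solved_within_k(trials, k) / len(trials)


if __name__ == "__main__":
    # Toy data: three test cases with up to six attempts each.
    toy = [
        [False, True],           # solved on the second trial
        [False] * 5,             # never solved
        [False] * 5 + [True],    # solved only on a sixth trial, outside k=5
    ]
    print(solved_within_k(toy, k=5))
    print(success_rate_at_k(toy, k=5))
```

Under this reading, 127 solved cases out of 200 would correspond to a rate of 127/200 = 0.635.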

Problem

Research questions and friction points this paper is trying to address.

Automate refinement of cellular network specifications using LLMs

Address reliability and security weaknesses in 3GPP standards

Evaluate LLMs for detecting security-related specification gaps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverage 200,000+ approved 3GPP Change Requests to build a domain dataset

Introduce the CR-eval framework and benchmark 16 LLMs

Fine-tune an 8B model to match or surpass GPT-4o and DeepSeek-R1
