🤖 AI Summary
To address low accuracy and high memory overhead in 3GPP standard question answering for telecom intelligent services, this paper proposes a lightweight, open-source RAG framework tailored for 6G. Methodologically, it introduces the first domain-specific hybrid retrieval paradigm for telecommunications—jointly leveraging official 3GPP documentation and web search results; designs a dictionary-enhanced query refinement mechanism to improve technical term understanding; and incorporates a lightweight neural router for dynamic retrieval path selection. The framework is fully compatible with open-source LLMs and features an optimized RAG pipeline. Key contributions include: (1) the first efficient, 3GPP-specialized QA framework; (2) significant improvements in technical QA accuracy (+17.6%) and terminology query performance (+10.6%); and (3) a 45% reduction in memory footprint. The open-sourced model achieves GPT-4–level performance on telecom-specific benchmarks.
📝 Abstract
Artificial intelligence will be one of the key pillars of the next generation of mobile networks (6G), as it is expected to provide novel added-value services and improve network performance. In this context, large language models have the potential to revolutionize the telecom landscape through intent comprehension, intelligent knowledge retrieval, coding proficiency, and cross-domain orchestration capabilities. This paper presents Telco-oRAG, an open-source Retrieval-Augmented Generation (RAG) framework optimized for answering technical questions in the telecommunications domain, with a particular focus on 3GPP standards. Telco-oRAG introduces a hybrid retrieval strategy that combines 3GPP domain-specific retrieval with web search, supported by glossary-enhanced query refinement and a neural router for memory-efficient retrieval. Our results show that Telco-oRAG improves the accuracy in answering 3GPP-related questions by up to 17.6% and achieves a 10.6% improvement in lexicon queries compared to baselines. Furthermore, Telco-oRAG reduces memory usage by 45% through targeted retrieval of relevant 3GPP series compared to baseline RAG, and enables open-source LLMs to reach GPT-4-level accuracy on telecom benchmarks.