Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications

📅 2024-09-09
🏛️ arXiv.org
📈 Citations: 6 (1 influential)
🤖 AI Summary
To address the suboptimal performance of general-purpose large language models (LLMs) on domain-specific tasks, such as comprehension of telecom terminology and its associated mathematical representations, this paper introduces Tele-LLMs, the first open-source family of specialized LLMs (1B–8B parameters) tailored to the telecommunications domain. Methodologically, the authors construct Tele-Data, a curated corpus of telecommunications material, and Tele-Eval, a large-scale question-and-answer benchmark for the domain. Through extensive experiments, spanning the division of expertise across telecommunications sub-areas, parameter-efficient fine-tuning, and analyses of how model size and training data affect adaptation, the approach balances domain specialization against general-language capability while mitigating catastrophic forgetting. Evaluations show that Tele-LLMs significantly outperform their general-purpose counterparts on Tele-Eval and telecommunications-literature tasks without losing previously acquired capabilities. All models and datasets are publicly released.
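
Among the adaptation techniques the paper explores is parameter-efficient fine-tuning. The sketch below illustrates the general pattern with a LoRA adapter via the Hugging Face transformers and peft libraries; the base checkpoint, corpus file, and hyperparameters are placeholders for illustration and are not taken from the paper.

```python
# Minimal LoRA fine-tuning sketch (illustrative; not the paper's exact setup).
# Assumes `transformers`, `peft`, and `datasets` are installed, and that a
# telecom corpus is available locally as JSON lines with a "text" field.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-3.1-8B"  # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model so only low-rank adapter weights are trained.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically under 1% of all weights

# Hypothetical local corpus of telecom documents.
data = load_dataset("json", data_files="tele_data.jsonl", split="train")
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                     max_length=2048),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tele-llm-lora",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```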

📝 Abstract
The emergence of large language models (LLMs) has significantly impacted various fields, from natural language processing to sectors like medicine and finance. However, despite their rapid proliferation, the applications of LLMs in telecommunications remain limited, often relying on general-purpose models that lack domain-specific specialization. This lack of specialization results in underperformance, particularly when dealing with telecommunications-specific technical terminology and their associated mathematical representations. This paper addresses this gap by first creating and disseminating Tele-Data, a comprehensive dataset of telecommunications material curated from relevant sources, and Tele-Eval, a large-scale question-and-answer dataset tailored to the domain. Through extensive experiments, we explore the most effective training techniques for adapting LLMs to the telecommunications domain, ranging from examining the division of expertise across various telecommunications aspects to employing parameter-efficient techniques. We also investigate how models of different sizes behave during adaptation and analyze the impact of their training data on this behavior. Leveraging these findings, we develop and open-source Tele-LLMs, the first series of language models ranging from 1B to 8B parameters, specifically tailored for telecommunications. Our evaluations demonstrate that these models outperform their general-purpose counterparts on Tele-Eval and telecommunications-related literature tasks while retaining their previously acquired capabilities, thus avoiding the catastrophic forgetting phenomenon.
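
The abstract stresses that the adapted models retain previously acquired capabilities. A common recipe for this during continued pretraining is to replay a small fraction of general-domain text alongside the domain corpus; the sketch below shows that pattern only as a hedged illustration, since the replay ratio and the mechanism itself are assumptions rather than details from the paper.

```python
# Illustrative replay-mixing sketch: interleave general-domain text with the
# telecom corpus so continued pretraining does not erase general abilities.
# The 10% replay ratio and the toy data are assumptions, not the paper's values.
import random

def mix_batches(domain_docs, general_docs, replay_ratio=0.1, seed=0):
    """Yield training documents, drawing from the general corpus with
    probability `replay_ratio` and from the domain corpus otherwise."""
    rng = random.Random(seed)
    domain_it, general_it = iter(domain_docs), iter(general_docs)
    while True:
        source = general_it if rng.random() < replay_ratio else domain_it
        try:
            yield next(source)
        except StopIteration:
            return  # stop once either stream is exhausted

# Usage with toy stand-ins for Tele-Data and a general web-text sample:
tele_data = [f"telecom doc {i}" for i in range(9)]
general = [f"general doc {i}" for i in range(9)]
for doc in mix_batches(tele_data, general):
    print(doc)
```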
Problem

Research questions and friction points this paper is trying to address.

Lack of specialized LLMs for the telecommunications domain
Underperformance with telecom-specific terminology and math
Need for domain-adapted training techniques and datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops Tele-Data and Tele-Eval domain-specific datasets (see the scoring sketch after this list)
Explores parameter-efficient LLM training techniques
Creates Tele-LLMs series (1B-8B) for telecom tasks
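
Since Tele-Eval is described as a large-scale question-and-answer dataset, a natural way to score a model on it is normalized exact match over predicted answers. The snippet below is a minimal sketch under that assumption; the file name tele_eval.jsonl, the record schema, and the metric choice are illustrative and not the paper's published protocol.

```python
# Hypothetical Tele-Eval-style scoring: normalized exact match over QA pairs.
# The JSONL schema ({"question", "answer"}) and model interface are assumed.
import json
import re

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

def exact_match(predictions, references) -> float:
    """Fraction of predictions matching the reference after normalization."""
    hits = sum(normalize(p) == normalize(r)
               for p, r in zip(predictions, references))
    return hits / max(len(references), 1)

def evaluate(model_answer_fn, path="tele_eval.jsonl") -> float:
    """Run `model_answer_fn(question) -> str` over the benchmark file."""
    questions, answers = [], []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            questions.append(record["question"])
            answers.append(record["answer"])
    preds = [model_answer_fn(q) for q in questions]
    return exact_match(preds, answers)
```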
Ali Maatouk
Yale University
Network Optimization · Machine Learning · Resource Allocation · 5G · Age of Information
Kenny Chirino Ampudia
Yale University
Rex Ying
Yale University
L. Tassiulas
Yale University