DeepForm: Reasoning Large Language Model for Communication System Formulation

📅 2025-06-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets three bottlenecks in automating communication system formulation for 6G: general-purpose large language models (LLMs) lack domain knowledge, their reasoning is not strong enough for the task, and high-quality domain-specific training data is scarce. It introduces DeepForm, described as the first reasoning LLM built for automated communication system formulation, together with the Communication System Formulation Reasoning Corpus (CSFRC), a large-scale open-source dataset. Training has two stages: supervised fine-tuning (SFT) on chain-of-thought (CoT) data to distill domain knowledge, followed by C-ReMax, a rule-based reinforcement learning (RL) algorithm built on ReMax that elicits reasoning patterns such as self-correction and verification. In the reported experiments, the model outperforms larger proprietary LLMs across diverse communication modeling scenarios. The authors state that related resources will be released after the paper is accepted.

📝 Abstract

Communication system formulation is critical for advancing 6G and future wireless technologies, yet it remains a complex, expertise-intensive task. While Large Language Models (LLMs) offer potential, existing general-purpose models often lack the specialized domain knowledge, nuanced reasoning capabilities, and access to high-quality, domain-specific training data that communication system formulation requires. To bridge this gap, we introduce DeepForm, the first reasoning LLM designed specifically for automated communication system formulation. We also present the Communication System Formulation Reasoning Corpus (CSFRC), the first large-scale, open-source dataset curated for this domain. Our framework employs a two-stage training strategy: first, Supervised Fine-Tuning (SFT) with Chain-of-Thought (CoT) data to distill domain knowledge; second, a novel rule-based Reinforcement Learning (RL) algorithm, C-ReMax, built on ReMax, to cultivate advanced modeling capabilities and elicit sophisticated reasoning patterns such as self-correction and verification. Extensive experiments demonstrate that our model achieves state-of-the-art performance, significantly outperforming larger proprietary LLMs in diverse scenarios. We will release related resources after the paper is accepted to foster further research in this area.
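The abstract does not spell out how C-ReMax works, so the following is only a minimal sketch of the two ingredients it names: a rule-based reward and the ReMax idea of using the reward of the greedy response as a baseline for the sampled response. The `<answer>` tag format, the reward values, and the function names `rule_reward` and `remax_advantage` are assumptions made for illustration and are not taken from the paper.

```python
import re


def rule_reward(response: str, reference: str) -> float:
    """Hypothetical rule-based reward (not the paper's actual rules).

    Returns -1.0 if the response has no <answer>...</answer> block,
    1.0 if the extracted answer matches the reference exactly,
    and 0.0 otherwise.
    """
    match = re.search(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    if match is None:
        return -1.0  # format rule violated
    answer = match.group(1).strip()
    return 1.0 if answer == reference.strip() else 0.0


def remax_advantage(sampled: str, greedy: str, reference: str) -> float:
    """ReMax-style advantage: reward of the sampled response minus the
    reward of the greedy-decoded response, which serves as the baseline."""
    return rule_reward(sampled, reference) - rule_reward(greedy, reference)


if __name__ == "__main__":
    reference = "C = B log2(1 + SNR)"
    sampled = "Let me verify the units first. <answer>C = B log2(1 + SNR)</answer>"
    greedy = "<answer>C = B log2(SNR)</answer>"
    print(remax_advantage(sampled, greedy, reference))
```

In a policy-gradient update, this advantage would scale the log-probability of the sampled response, so samples that beat the greedy output are reinforced and samples that fall short are discouraged.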

Problem

Research questions and friction points this paper is trying to address.

Specialized LLM for complex 6G communication system design

Lack of domain-specific training data for LLMs

Need for advanced reasoning capabilities in wireless technology formulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Specialized LLM for communication system formulation

Two-stage training with SFT and rule-based RL

Novel dataset CSFRC for domain-specific reasoning
