A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

🤖 AI Summary
This survey addresses the central challenge of leveraging deep learning for code understanding, generation, and optimization—termed Neural Code Intelligence (NCI). Methodologically, it integrates program analysis, natural language processing, deep learning, and large language model techniques, covering sequence modeling, pretraining-finetuning paradigms, multi-task evaluation, and semantic representation. Based on a systematic review of 680+ papers and 50+ representative models across 20+ task categories, the work establishes the first comprehensive historical taxonomy of NCI—from RNNs to modern LLMs—structured into four evolutionary paradigm stages. It further uncovers synergistic mechanisms and cross-domain integration pathways between code intelligence and general-purpose machine intelligence. The contributions include an authoritative, continuously updated resource repository (hosted on GitHub), a unified conceptual framework clarifying persistent technical bottlenecks (e.g., semantic fidelity, compositional generalization), and a pragmatic technology roadmap guiding both academic research and industrial deployment.

📝 Abstract
Neural Code Intelligence -- leveraging deep learning to understand, generate, and optimize code -- holds immense potential for transformative impacts on society as a whole. Bridging the gap between Natural Language and Programming Language, this domain has drawn significant attention from researchers in both research communities over the past few years. This survey presents a systematic and chronological review of the advancements in code intelligence, encompassing over 50 representative models and their variants, more than 20 categories of tasks, and an extensive coverage of over 680 related works. We follow the historical progression to trace the paradigm shifts across different research phases (e.g., from modeling code with recurrent neural networks to the era of Large Language Models). Concurrently, we highlight the major technical transitions in models, tasks, and evaluations across different stages. For applications, we also observe a co-evolving shift. It spans from initial endeavors to tackle specific scenarios, through exploring a diverse array of tasks during its rapid expansion, to currently focusing on tackling increasingly complex and varied real-world challenges. Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains. Finally, we delve into both the opportunities and challenges associated with this field, and share our insights on the most promising research directions. An ongoing, dynamically updated project and resources associated with this survey have been released at https://github.com/QiushiSun/Awesome-Code-Intelligence.
Problem

Research questions and friction points this paper is trying to address.

Neural Code Intelligence
Deep Learning
Code Processing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural Code Intelligence
Deep Learning Applications
Comprehensive Review
Qiushi Sun
The University of Hong Kong, National University of Singapore
Natural Language Processing, Agents, Code Intelligence
Zhirui Chen
National University of Singapore
Reinforcement Learning, Large Language Model, 3D Computer Vision
Fangzhi Xu
Xi'an Jiaotong University | Nanyang Technological University
Large Language Models, Self-Training, Reasoning, GUI Agents
Kanzhi Cheng
Nanjing University Ph.D Student
Vision-Language Models, AI Agents, Image Captioning
Chang Ma
Shanghai AI Laboratory, Shanghai, China
Zhangyue Yin
School of Computer Science, Fudan University, Shanghai, China
Jianing Wang
School of Data Science and Engineering, East China Normal University, Shanghai, China
Chengcheng Han
Meituan | East China Normal University
NLP, KG
Renyu Zhu
NetEase Fuxi AI Lab | East China Normal University
Shuai Yuan
Shanghai AI Laboratory, Shanghai, China
Qipeng Guo
Fudan University
Xipeng Qiu
School of Computer Science, Fudan University, Shanghai, China
Pengcheng Yin
Google DeepMind
Natural Language Processing, AI for Code
Xiaoli Li
Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore; School of Computer Science and Engineering, Nanyang Technological University, Singapore
Fei Yuan
Minnesota State University, Mankato
remote sensing, GIS, environmental monitoring and assessment, natural resource mapping
Lingpeng Kong
Google DeepMind, The University of Hong Kong
Natural Language Processing, Machine Learning
Xiang Li
School of Data Science and Engineering, East China Normal University, Shanghai, China
Zhiyong Wu
Shanghai AI Laboratory, Shanghai, China