Towards Advancing Code Generation with Large Language Models: A Research Roadmap

📅 2025-01-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low reliability, poor robustness, and limited engineering deployability of large language models (LLMs) in real-world programming scenarios, this paper proposes the first process-oriented code generation framework encompassing six stages: input understanding, task orchestration, code development, verification, debugging, and refinement. Through systematic literature analysis and challenge attribution modeling, we identify for the first time the structural deficiencies of LLMs and LLM-based agents in software engineering practice. We design a multi-dimensional evaluation framework and establish a reproducible research paradigm with practical implementation guidelines. Our contributions provide theoretical foundations and actionable pathways for designing, evaluating, and optimizing industrial-grade LLM-powered code generation systems, with the aim of enhancing the practical utility and deployment feasibility of generated code.

📝 Abstract

Recently, we have witnessed the rapid development of large language models (LLMs), which have demonstrated excellent capabilities in the downstream task of code generation. However, despite their potential, LLM-based code generation still faces numerous technical and evaluation challenges, particularly when embedded in real-world development. In this paper, we present our vision for current research directions and provide an in-depth analysis of existing studies on this task. We propose a six-layer vision framework that categorizes the code generation process into distinct phases, namely Input Phase, Orchestration Phase, Development Phase, and Validation Phase. Additionally, we outline our vision workflow, which reflects on the currently prevalent frameworks. We systematically analyse the challenges faced by large language models, including LLM-based agent frameworks, in code generation tasks. Building on this analysis, we offer various perspectives and actionable recommendations in this area. Our aim is to provide guidelines for improving the reliability, robustness, and usability of LLM-based code generation systems. Ultimately, this work seeks to address persistent challenges and to provide practical suggestions for more pragmatic LLM-based solutions in future code generation endeavors.
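As a rough illustration of what a phase-structured code generation process could look like, the sketch below chains the four phases named in the abstract. It is not the paper's implementation: the `Artifact` container, the function names, and the hard-coded "generated" code standing in for an LLM call are all hypothetical, chosen only to show how each phase might hand its output to the next.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Artifact:
    """Hypothetical container passed from phase to phase."""
    request: str
    plan: List[str] = field(default_factory=list)
    code: str = ""
    passed: bool = False
    log: List[str] = field(default_factory=list)


def input_phase(a: Artifact) -> Artifact:
    # Normalise the raw request (here: collapse runs of whitespace).
    a.request = " ".join(a.request.split())
    a.log.append("input")
    return a


def orchestration_phase(a: Artifact) -> Artifact:
    # Break the request into steps; a single step in this toy example.
    a.plan = [f"implement: {a.request}"]
    a.log.append("orchestration")
    return a


def development_phase(a: Artifact) -> Artifact:
    # Assumption: a fixed string stands in for model-generated code.
    a.code = "def add(x, y):\n    return x + y\n"
    a.log.append("development")
    return a


def validation_phase(a: Artifact) -> Artifact:
    # Execute the candidate code and check it against one test case.
    namespace: dict = {}
    exec(a.code, namespace)
    a.passed = namespace["add"](2, 3) == 5
    a.log.append("validation")
    return a


PHASES: List[Callable[[Artifact], Artifact]] = [
    input_phase,
    orchestration_phase,
    development_phase,
    validation_phase,
]


def run_pipeline(request: str) -> Artifact:
    artifact = Artifact(request=request)
    for phase in PHASES:
        artifact = phase(artifact)
    return artifact


result = run_pipeline("  write an   add function ")
```

Keeping each phase as a plain function over a shared record makes it easy to swap one phase out or to inspect `result.log` to see how far a request progressed. A real system would replace the fixed string with a model call and run the validation step in a sandbox.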

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Code Generation

Programming Tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models

Code Generation Framework

Systematic Reliability Enhancement

🔎 Similar Papers

A Survey on Evaluating Large Language Models in Code Generation Tasks