SRLCG: Self-Rectified Large-Scale Code Generation with Multidimensional Chain-of-Thought and Dynamic Backtracking

📅 2025-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Novice users struggle to assemble isolated code snippets generated by large language models (LLMs) into complete, multi-file software projects. Method: We propose the first end-to-end framework for project-level code generation, integrating multi-dimensional chain-of-thought (CoT) reasoning with a self-correction mechanism to model cross-file semantic consistency; we further design a dynamic backtracking algorithm that combines dependency-aware synthesis and constraint-driven refinement to ensure structural and logical robustness of the generated project. Contribution/Results: Our approach breaks from conventional single-snippet generation paradigms, enabling direct synthesis of buildable and executable multi-file projects from a single natural language prompt. Experiments show that our generated codebase is 15× larger than that produced by DeepSeek-V3 and 16× larger than that of GPT-4, while significantly outperforming existing CoT-based methods in correctness, completeness, and cross-file consistency.

📝 Abstract
Large language models (LLMs) have revolutionized code generation, significantly enhancing developer productivity. However, for a vast number of users with minimal coding knowledge, LLMs provide little support, as they primarily generate isolated code snippets rather than complete, large-scale project code. Without coding expertise, these users struggle to interpret, modify, and iteratively refine the outputs of LLMs, making it impossible to assemble a complete project. To address this issue, we propose Self-Rectified Large-Scale Code Generator (SRLCG), a framework that generates complete multi-file project code from a single prompt. SRLCG employs a novel multidimensional chain-of-thought (CoT) and self-rectification to guide LLMs in generating correct and robust code files, then integrates them into a complete and coherent project using our proposed dynamic backtracking algorithm. Experimental results show that SRLCG generates code 15x longer than DeepSeek-V3, 16x longer than GPT-4, and at least 10x longer than other leading CoT-based baselines. Furthermore, they confirm its improved correctness, robustness, and performance compared to baselines in large-scale code generation.
Problem

Research questions and friction points this paper is trying to address.

Novice users cannot assemble isolated LLM-generated snippets into complete projects
Standard LLMs emit single-file snippets rather than multi-file project code
Correctness, robustness, and cross-file consistency degrade as generated code scales to project size
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multidimensional chain-of-thought guides code generation
Dynamic backtracking integrates files into complete projects
Self-rectification ensures correct and robust outputs
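The pipeline the bullets describe (generate each file with CoT guidance, self-rectify failures, then integrate files via dependency-aware dynamic backtracking) can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the paper's implementation: `generate_file`, `validate`, and the `dep_graph` shape are hypothetical stand-ins for the LLM call and consistency checks.

```python
from collections import deque

def topological_order(dep_graph):
    """Kahn's algorithm over a {file: [dependencies]} graph."""
    indeg = {f: len(deps) for f, deps in dep_graph.items()}
    dependents = {f: [] for f in dep_graph}
    for f, deps in dep_graph.items():
        for d in deps:
            dependents[d].append(f)
    queue = deque(f for f, n in indeg.items() if n == 0)
    order = []
    while queue:
        f = queue.popleft()
        order.append(f)
        for g in dependents[f]:
            indeg[g] -= 1
            if indeg[g] == 0:
                queue.append(g)
    return order

def generate_file(path, context):
    """Stand-in for a CoT-guided LLM call that drafts one file."""
    return f"# {path} (uses: {', '.join(sorted(context)) or 'nothing'})"

def validate(path, code):
    """Stand-in for self-rectification checks (syntax, cross-file consistency)."""
    return code.startswith("#")

def assemble_project(dep_graph, max_retries=3):
    """Generate files in dependency order; retry, then backtrack on failure."""
    order = topological_order(dep_graph)
    project, i = {}, 0
    retries = {p: 0 for p in order}
    while i < len(order):
        path = order[i]
        # Dependency-aware synthesis: already-accepted files form the context.
        context = {d: project[d] for d in dep_graph[path]}
        code = generate_file(path, context)
        if validate(path, code):
            project[path] = code
            i += 1
        elif retries[path] < max_retries:
            retries[path] += 1  # self-rectify: regenerate the same file
        else:
            # Dynamic backtracking (sketch): invalidate the most recently
            # generated dependency and resume synthesis from there.
            retries[path] = 0
            dep = max(dep_graph[path], key=order.index, default=path)
            project.pop(dep, None)
            i = order.index(dep)
    return project

if __name__ == "__main__":
    deps = {"utils.py": [], "db.py": ["utils.py"], "app.py": ["db.py", "utils.py"]}
    print(list(assemble_project(deps)))  # files emitted in dependency order
```

The toy `validate` always passes, so the backtracking branch is never exercised here; in the paper's setting that branch is where constraint-driven refinement would decide which earlier file to regenerate.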
👥 Authors
Hongru Ma, Beihang University, China
Yanjie Liang, Shandong University, China
Jiasheng Si, Qilu University of Technology (Shandong Academy of Sciences), China
Weiyu Zhang, Qilu University of Technology (Shandong Academy of Sciences), China
Hongjiao Guan, Qilu University of Technology (Shandong Academy of Sciences), China
Chaoqun Zheng, Qilu University of Technology (Shandong Academy of Sciences), China
Bing Xu, Harbin Institute of Technology, China
Wenpeng Lu, Qilu University of Technology (Shandong Academy of Sciences), China