🤖 AI Summary
Problem: Large language models (LLMs) exhibit weak long-range reasoning, limited contextual understanding, and insufficient compositional reasoning in complex programming tasks. Method: This paper proposes a multi-agent, guidance-driven code generation framework. Its core idea is a fine-grained task decomposition mechanism that treats the LLM as a fuzzy retriever rather than an end-to-end generator. The multi-role agent system is built on a quantized (int4) Llama 3.1 8B model and integrates task planning, progressive subproblem decomposition, dynamic verification, and feedback-driven correction. Contribution/Results: Evaluated on HumanEval, the framework achieves a 23.79% improvement in solution accuracy over direct one-shot generation. The results suggest that guided generation improves the practicality and robustness of LLMs in software development and mitigates their limitations in structured program synthesis.
📝 Abstract
Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks, yet they face significant limitations in handling complex, long-context programming challenges and in demonstrating compositional reasoning. This paper introduces a novel agentic framework for "guided code generation" that aims to address these limitations through a deliberately structured, fine-grained approach to code generation tasks. Our framework leverages LLMs' strengths as fuzzy searchers and approximate information retrievers while mitigating their weaknesses in long sequential reasoning and long-context understanding. Empirical evaluation using OpenAI's HumanEval benchmark with Meta's Llama 3.1 8B model (int4 precision) demonstrates a 23.79% improvement in solution accuracy compared to direct one-shot generation. Our results indicate that structured, guided approaches to code generation can significantly enhance the practical utility of LLMs in software development while mitigating their inherent limitations in compositional reasoning and context handling.
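The summary describes a loop of decomposition, verification, and feedback-driven correction. The following is a minimal sketch of that general pattern, not the paper's implementation. All names here (`Subtask`, `verify`, `solve_subtask`, `guided_generate`) are hypothetical, the model is abstracted as any prompt-to-text callable, and the plan is assumed to be supplied by a separate planning step that is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: the model is any function mapping a prompt to generated source code.
LLM = Callable[[str], str]


@dataclass
class Subtask:
    """One small unit of work produced by a (not shown) planning step."""
    name: str         # name of the function to generate
    spec: str         # natural-language description of its behavior
    tests: List[str]  # assert statements used to check the generated code


def verify(code: str, tests: List[str]) -> str:
    """Execute code and its tests in a fresh namespace.

    Returns an empty string on success, otherwise a short error description.
    Note: exec on model output is unsafe outside a sandbox.
    """
    namespace: Dict[str, object] = {}
    try:
        exec(code, namespace)
        for test in tests:
            exec(test, namespace)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return ""


def solve_subtask(llm: LLM, task: Subtask, max_rounds: int = 3) -> str:
    """Generate code for one subtask, feeding failures back into the prompt."""
    base_prompt = f"Write a Python function `{task.name}`. {task.spec}"
    prompt = base_prompt
    for _ in range(max_rounds):
        code = llm(prompt)
        error = verify(code, task.tests)
        if not error:
            return code
        # Feedback-driven correction: show the failed attempt and its error.
        prompt = (
            f"{base_prompt}\nPrevious attempt:\n{code}\n"
            f"Failed with: {error}\nFix it."
        )
    raise RuntimeError(f"no passing solution for {task.name!r}")


def guided_generate(llm: LLM, subtasks: List[Subtask]) -> str:
    """Solve each subtask independently and join the verified pieces."""
    return "\n\n".join(solve_subtask(llm, task) for task in subtasks)
```

Keeping each prompt scoped to a single small function is what lets a compact model act as an approximate retriever of short, familiar code fragments rather than as a planner for the whole program. In practice the generated code would be run in a sandbox rather than with a bare `exec`.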