ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation

📅 2024-05-27

🏛️ arXiv.org

📈 Citations: 7

✨ Influential: 1

🤖 AI Summary

Problem: Existing one-off code generation methods have limited capacity for complex reasoning. Method: This paper proposes ReflectionCoder, a paradigm that learns from reflection sequences built from compiler feedback. It introduces reflection self-distillation and dynamically masked distillation to turn multi-step compiler feedback (such as error types, locations, and repair suggestions) into learnable reflection sequences, which are then used in the supervised fine-tuning of large language models for sequence-level knowledge transfer. Contribution/Results: The work is presented as the first to explicitly model compiler feedback as a *reflection trajectory* that guides code generation, rather than using it only as a post-hoc evaluation signal. The dynamic masking mechanism is reported to mitigate feedback noise and improve generalization. The approach achieves state-of-the-art performance on the HumanEval (+), MBPP (+), and MultiPL-E benchmarks, with reported pass-rate gains and better robustness to both syntax errors and logic errors.

📝 Abstract

Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the compiler. Inspired by this, we present ReflectionCoder, a novel approach that effectively leverages reflection sequences constructed by integrating compiler feedback to improve one-off code generation performance. Furthermore, we propose reflection self-distillation and dynamically masked distillation to effectively utilize these reflection sequences. Extensive experiments on three benchmarks, i.e., HumanEval (+), MBPP (+), and MultiPL-E, demonstrate that models fine-tuned with our method achieve state-of-the-art performance. Beyond the code domain, we believe this approach can benefit other domains that focus on final results and require long reasoning paths. Code and data are available at https://github.com/SenseLLM/ReflectionCoder.
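To make the idea of a reflection sequence concrete, here is a minimal sketch of how attempts and execution feedback could be interleaved until one attempt passes. It is an illustration under stated assumptions, not the authors' pipeline: `run_candidate`, `build_reflection_sequence`, and the hard-coded `candidates` list (standing in for successive model outputs) are hypothetical names, and Python's own `compile`/`exec` plays the role of the compiler.

```python
def run_candidate(source: str, test: str) -> str:
    """Run one candidate program against a test; return 'ok' or an error summary."""
    namespace = {}
    try:
        # compile() surfaces syntax errors; exec() surfaces runtime and test errors.
        exec(compile(source, "<candidate>", "exec"), namespace)
        exec(test, namespace)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"


def build_reflection_sequence(candidates, test):
    """Interleave each attempt with its feedback, stopping at the first passing attempt."""
    sequence = []
    for source in candidates:
        feedback = run_candidate(source, test)
        sequence.append({"code": source, "feedback": feedback})
        if feedback == "ok":
            break
    return sequence


# Hypothetical attempts standing in for successive model generations (assumption).
candidates = [
    "def add(a, b) return a + b",          # syntax error
    "def add(a, b):\n    return a - b",    # logic error
    "def add(a, b):\n    return a + b",    # correct
]
test = "assert add(2, 3) == 5"

sequence = build_reflection_sequence(candidates, test)
for step in sequence:
    print(step["feedback"])
```

The first entry records a `SyntaxError`, the second an `AssertionError`, and the third `ok`; the exact error text depends on the Python version. In a one-off setting, only the final passing program is what the model is ultimately expected to produce directly.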

Problem

Research questions and friction points this paper is trying to address.

Enhancing one-off code generation using reflection sequences

Improving performance with reflection self-distillation techniques

Achieving state-of-the-art results on code generation benchmarks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages reflection sequences from compiler feedback

Uses reflection self-distillation for better performance

Applies dynamically masked distillation techniques
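This page does not describe how the dynamic masking works, so the following is only one plausible reading: hide a growing share of the reflection sequence as training progresses, so that the model relies on it less and less. The linear schedule, the random choice of positions, the `<mask>` token, and the names `mask_reflection` and `build_teacher_input` are all assumptions made for illustration.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token (assumption)


def mask_reflection(reflection_tokens, progress, rng):
    """Replace a share of reflection tokens with MASK.

    `progress` is the training progress in [0, 1]; the masked share grows
    linearly with it (an assumed schedule, not taken from the paper).
    """
    progress = min(max(progress, 0.0), 1.0)
    n_mask = round(progress * len(reflection_tokens))
    masked_positions = set(rng.sample(range(len(reflection_tokens)), n_mask))
    return [MASK if i in masked_positions else tok
            for i, tok in enumerate(reflection_tokens)]


def build_teacher_input(instruction, reflection_tokens, final_code, progress, rng):
    """Assemble instruction + partially masked reflection + final code."""
    return [instruction, *mask_reflection(reflection_tokens, progress, rng), final_code]


rng = random.Random(0)
reflection = ["attempt_1", "SyntaxError", "attempt_2", "AssertionError"]

early = build_teacher_input("Write add(a, b).", reflection, "final_code", 0.0, rng)
middle = build_teacher_input("Write add(a, b).", reflection, "final_code", 0.5, rng)
late = build_teacher_input("Write add(a, b).", reflection, "final_code", 1.0, rng)
```

At `progress=0.0` the reflection sequence is fully visible, at `0.5` half of its four tokens are masked, and at `1.0` the input reduces to the instruction and the final code, which matches the one-off generation setting.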
