ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation

📅 2024-05-27
🏛️ arXiv.org
📈 Citations: 7
Influential: 1

🤖 AI Summary
Problem: Existing one-off code generation methods suffer from insufficient complex reasoning capabilities. Method: This paper proposes a compiler-feedback-driven reflection sequence modeling paradigm. The authors design reflection self-distillation and dynamically masked distillation mechanisms that structure multi-step compiler feedback (such as error types, locations, and repair suggestions) into learnable reflection sequences. These sequences are integrated into the supervised fine-tuning pipeline of large language models to enable sequence-level knowledge transfer and improve robustness. Contribution/Results: According to the summary, this is the first work to explicitly model compiler feedback as a reflective trajectory that guides code generation, rather than using it only as a post-hoc evaluation signal. The dynamic masking mechanism is reported to mitigate feedback noise and improve generalization. The approach achieves state-of-the-art performance on the HumanEval(+), MBPP(+), and MultiPL-E benchmarks, with higher pass rates and better robustness against both syntactic errors and logical flaws.

📝 Abstract
Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the compiler. Inspired by this, we present ReflectionCoder, a novel approach that effectively leverages reflection sequences constructed by integrating compiler feedback to improve one-off code generation performance. Furthermore, we propose reflection self-distillation and dynamically masked distillation to effectively utilize these reflection sequences. Extensive experiments on three benchmarks, i.e., HumanEval (+), MBPP (+), and MultiPL-E, demonstrate that models fine-tuned with our method achieve state-of-the-art performance. Beyond the code domain, we believe this approach can benefit other domains that focus on final results and require long reasoning paths. Code and data are available at https://github.com/SenseLLM/ReflectionCoder.
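
To make the idea of a reflection sequence concrete, the sketch below records a series of code attempts together with the execution feedback each one produces, stopping at the first attempt that passes. It is only an illustration under stated assumptions: the names `ReflectionStep`, `run_with_feedback`, and `build_reflection_sequence` are hypothetical, the attempts are a fixed list standing in for model outputs, and Python's `exec` stands in for a compiler or sandboxed executor. None of this is taken from the ReflectionCoder codebase.

```python
from dataclasses import dataclass


@dataclass
class ReflectionStep:
    """One attempt plus the feedback it produced (hypothetical structure)."""
    code: str
    feedback: str  # empty string when the attempt passed


def run_with_feedback(code: str, test: str) -> str:
    """Run code and test in a fresh namespace; return an error summary or ''."""
    namespace = {}
    try:
        # compile() surfaces syntax errors, exec() surfaces runtime/test failures.
        exec(compile(code + "\n" + test, "<attempt>", "exec"), namespace)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return ""


def build_reflection_sequence(attempts, test):
    """Collect attempts and their feedback until one attempt passes."""
    steps = []
    for code in attempts:
        feedback = run_with_feedback(code, test)
        steps.append(ReflectionStep(code, feedback))
        if not feedback:
            break  # the last step holds the final, passing code
    return steps


# Stand-in for model outputs: a buggy attempt followed by a corrected one.
attempts = [
    "def add(a, b):\n    return a - b\n",
    "def add(a, b):\n    return a + b\n",
]
steps = build_reflection_sequence(attempts, "assert add(2, 3) == 5")
for i, step in enumerate(steps, start=1):
    print(f"attempt {i}: {'passed' if not step.feedback else step.feedback}")
```

In this toy setup, the full list of steps plays the role of the reflection sequence, and the code in the last step plays the role of the one-off answer a fine-tuned model would be expected to produce directly.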
Problem

Research questions and friction points this paper is trying to address.

Enhancing one-off code generation using reflection sequences
Improving performance with reflection self-distillation techniques
Achieving state-of-the-art results on code generation benchmarks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages reflection sequences from compiler feedback
Uses reflection self-distillation for better performance
Applies dynamically masked distillation techniques
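
The page names dynamically masked distillation without describing its mechanics. One plausible reading is that parts of the reflection sequence are hidden during training, with the hidden fraction changing over time so the model relies less and less on the reflection. The sketch below illustrates that reading only. The linear schedule, the random choice of positions, the `<mask>` placeholder, and the function names `mask_rate_at` and `mask_reflection` are all assumptions made for illustration, not details confirmed by the source.

```python
import random


def mask_rate_at(step: int, total_steps: int, start: float = 0.0, end: float = 1.0) -> float:
    """Linear schedule from `start` to `end` over training (assumed, not from the paper)."""
    frac = step / max(total_steps - 1, 1)
    return start + (end - start) * frac


def mask_reflection(tokens, mask_rate, rng, mask_token="<mask>"):
    """Replace a `mask_rate` fraction of reflection tokens with a placeholder."""
    k = round(len(tokens) * mask_rate)
    masked_positions = set(rng.sample(range(len(tokens)), k))
    return [mask_token if i in masked_positions else tok for i, tok in enumerate(tokens)]


rng = random.Random(0)  # fixed seed so the example is reproducible
reflection = ["attempt", "error", "fix", "retry"]
for step in range(3):
    rate = mask_rate_at(step, total_steps=3)
    print(f"step {step}: rate={rate:.1f} -> {mask_reflection(reflection, rate, rng)}")
```

With a rate of 0 the model sees the whole reflection sequence, and with a rate of 1 it sees none of it, which matches the one-off setting where only the instruction and the final code are available.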