AI Summary
This work addresses a limitation of existing programming tutoring systems: they often repair learners' code errors without explaining the underlying causes, which hinders personalized instruction. To bridge this gap, the paper introduces the novel task of Learner-Tailored Program Repair (LPR) and proposes a two-stage, end-to-end framework that combines retrieval augmentation with large language models. The approach first uses edit-driven retrieval to identify relevant repair exemplars, then uses the retrieved solutions to guide the generation of both accurate repairs and interpretable error explanations. An iterative retrieval mechanism further refines repair quality based on evaluation of the generated code. Experimental results show that the proposed method outperforms multiple baselines by a large margin, supporting its effectiveness and practicality in programming education contexts.
Abstract
With the development of large language models (LLMs) for programming, intelligent programming coaching systems have gained widespread attention. However, most research focuses on repairing the buggy code of programming learners without explaining the underlying causes of the bugs. To address this gap, we introduce a novel task, LPR (Learner-Tailored Program Repair). We then propose a novel and effective framework, LSGEN (Learner-Tailored Solution Generator), which enhances program repair while providing bug descriptions for the buggy code. In the first stage, we use a repair solution retrieval framework to construct a solution retrieval database and then employ an edit-driven code retrieval approach to retrieve valuable solutions, which guide LLMs in identifying and fixing the bugs. In the second stage, we propose a solution-guided program repair method, which fixes the code and provides explanations under the guidance of the retrieved solutions. Moreover, we propose an Iterative Retrieval Enhancement method that uses the evaluation results of the generated code to iteratively optimize the retrieval direction and explore more suitable repair strategies, improving performance in practical programming coaching scenarios. Experimental results show that our approach outperforms a set of baselines by a large margin, validating the effectiveness of our framework for the newly proposed LPR task.
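The abstract gives no implementation details, so the following is only a minimal sketch of how a retrieve, repair, evaluate loop of this kind could be organized. Everything named here is an assumption for illustration and is not taken from the paper: the `Solution` record, the character-level `difflib` similarity standing in for edit-driven retrieval, the `repair_fn` callable standing in for the LLM call, and the `passes_tests` evaluator.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Solution:
    """Hypothetical database entry: a past bug, its fix, and an explanation."""
    buggy: str
    fixed: str
    description: str


def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; higher means fewer edits apart."""
    return SequenceMatcher(None, a, b).ratio()


def retrieve(query: str, database: List[Solution], k: int = 1,
             skip: Optional[Set[int]] = None) -> List[int]:
    """Return indices of the k entries whose buggy code is closest to the query."""
    skip = skip or set()
    ranked = sorted(
        (i for i in range(len(database)) if i not in skip),
        key=lambda i: edit_similarity(query, database[i].buggy),
        reverse=True,
    )
    return ranked[:k]


def iterative_repair(
    buggy: str,
    database: List[Solution],
    repair_fn: Callable[[str, Solution], Tuple[str, str]],
    passes_tests: Callable[[str], bool],
    max_rounds: int = 3,
) -> Optional[Tuple[str, str]]:
    """Retrieve a solution, attempt a repair, and re-retrieve on failure."""
    query, tried = buggy, set()
    for _ in range(max_rounds):
        hits = retrieve(query, database, k=1, skip=tried)
        if not hits:
            break  # no untried solutions remain
        tried.update(hits)
        candidate, explanation = repair_fn(buggy, database[hits[0]])
        if passes_tests(candidate):
            return candidate, explanation
        # Assumption: a failed candidate becomes the next retrieval query.
        query = candidate
    return None


# Toy stand-in for the LLM: reuse the retrieved fix and its description.
def copy_fix(buggy: str, solution: Solution) -> Tuple[str, str]:
    return solution.fixed, solution.description


# Toy evaluator: run the candidate and check one test case for `add`.
def passes_tests(code: str) -> bool:
    scope: dict = {}
    try:
        exec(code, scope)
        return scope["add"](2, 3) == 5
    except Exception:
        return False


database = [
    Solution("def add(a, b):\n    return a - b\n",
             "def add(a, b):\n    return a + b\n",
             "Used subtraction instead of addition."),
    Solution("def mul(a, b):\n    return a + b\n",
             "def mul(a, b):\n    return a * b\n",
             "Used addition instead of multiplication."),
]
learner_code = "def add(x, y):\n    return x - y\n"
result = iterative_repair(learner_code, database, copy_fix, passes_tests)
```

In this toy setup the loop returns both the repaired code and a bug description, mirroring the two outputs the LPR task asks for. A real system would replace `copy_fix` with an LLM prompt that includes the retrieved solution, and `passes_tests` with the course's actual test suite.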