HELO-APR: Enhancing Low-Resource Program Repair through Cross-Lingual Knowledge Transfer

📅 2026-04-18

🤖 AI Summary

This work addresses the weak performance of large language models on automated program repair for low-resource programming languages, which stems from the scarcity of verified bug-fix pairs for training. The authors propose HELO-APR, a two-stage framework that transfers repair knowledge from high-resource to low-resource languages. It first synthesizes idiomatic target-language bug-fix pairs from high-resource counterparts while preserving the defect type, then applies curriculum learning that moves from high-resource repair learning through cross-lingual repair alignment to low-resource repair adaptation. With C++ as the source language and Ruby and Rust as targets, Pass@1 on xCodeEval rises by up to 17.33 percentage points (31.32% to 48.65% on DeepSeek-Coder-6.7B), the average target compilation rate on CodeLlama rises from 49.77% to 91.98%, and on Defects4Ruby BLEU-4 and ROUGE-1 improve from 61.20 to 66.79 and from 76.76 to 83.59 on CodeLlama-7B.
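For readers unfamiliar with the headline metric, Pass@1 is commonly computed with the unbiased pass@k estimator used in code-generation evaluation. The sketch below shows that estimator; whether the paper uses this exact estimator or a single greedy sample per problem is not stated here, so treat it as an assumption. The function names `pass_at_k` and `mean_pass_at_k` are illustrative.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them pass all tests.

    Assumption: this is the widely used estimator 1 - C(n-c, k) / C(n, k);
    the summarized paper may compute Pass@1 differently.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) per problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

For k = 1 the estimator reduces to the fraction of passing samples, c / n, so a benchmark-level Pass@1 is the mean of those fractions across problems.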

📝 Abstract

Large Language Models (LLMs) perform well on automatic program repair (APR) for high-resource programming languages (HRPLs), but their effectiveness drops sharply in low-resource programming languages (LRPLs), due to a lack of sufficient verified buggy-fixed pairs for APR training. To address this challenge, we propose HELO-APR (High-resource Enabled LOw-resource APR), a two-stage APR framework that enables cross-lingual transfer of repair knowledge from HRPLs to LRPLs. HELO-APR (1) constructs high-quality LRPL training data by synthesizing LRPL buggy-fixed pairs from HRPL counterparts, preserving defect type consistency while ensuring the synthesized code is idiomatic, and then (2) adopts a curriculum learning strategy that progressively performs HRPL repair learning, cross-lingual repair alignment, and LRPL repair adaptation, improving repair effectiveness in LRPLs. Using C++ as the source HRPL and Ruby and Rust as the target LRPLs, experiments on xCodeEval show that HELO-APR consistently outperforms strong baselines, increasing Pass@1 from 31.32% to 48.65% on DeepSeek-Coder-6.7B and from 1.67% to 11.97% on CodeLlama-7B, while improving syntactic validity by raising the average target compilation rate on CodeLlama from 49.77% to 91.98%. On Defects4Ruby, HELO-APR increases BLEU-4 from 61.20 to 66.79 and ROUGE-1 from 76.76 to 83.59 on CodeLlama-7B, indicating higher similarity to developer patches in real-world settings. Finally, we conduct ablation studies to assess the necessity of each core component. These results suggest that verified cross-lingual supervision provides a reusable approach for improving LLM-based repair in low-resource languages.
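The three curriculum phases named in the abstract (HRPL repair learning, cross-lingual repair alignment, LRPL repair adaptation) can be pictured as an ordered sequence of training sets. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `RepairPair` type, the `curriculum_stages` function, and in particular the prompt layout used for the alignment stage are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

Example = Tuple[str, str]  # (model input, expected fixed code)


@dataclass(frozen=True)
class RepairPair:
    lang: str
    buggy: str
    fixed: str
    # Hypothetical link to the HRPL pair this LRPL pair was synthesized from.
    source: Optional["RepairPair"] = None


def curriculum_stages(
    hrpl: List[RepairPair], lrpl: List[RepairPair]
) -> Iterator[Tuple[str, List[Example]]]:
    """Yield (stage name, examples) in the order the abstract describes."""
    # Stage 1: learn repair on the high-resource language alone.
    yield "hrpl_repair_learning", [(p.buggy, p.fixed) for p in hrpl]

    # Stage 2: alignment. Assumed format: show the source-language bug and fix
    # as context, then ask for the fix of the synthesized target-language bug.
    aligned = [p for p in lrpl if p.source is not None]
    yield "cross_lingual_alignment", [
        (f"{p.source.buggy}\n{p.source.fixed}\n{p.buggy}", p.fixed)
        for p in aligned
    ]

    # Stage 3: adapt on target-language pairs only.
    yield "lrpl_repair_adaptation", [(p.buggy, p.fixed) for p in lrpl]
```

A fine-tuning loop would consume the stages in order, training on each stage's examples before moving to the next, so that later stages start from the weights produced by earlier ones.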

Problem

Research questions and friction points this paper is trying to address.

low-resource programming languages

automatic program repair

cross-lingual knowledge transfer

buggy-fixed pairs

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-lingual knowledge transfer

low-resource program repair

synthetic data generation