🤖 AI Summary
This work addresses the inefficiency and error-proneness of existing automated program repair methods in large codebases, which often neglect previously successful repairs and therefore redo reasoning from scratch. The authors propose a result-conditioned backward reasoning distillation mechanism that, for the first time, reconstructs stepwise repair trajectories from verified patch outcomes without requiring fine-tuning or online search. This approach extracts transferable reasoning logic to guide fault localization and patch generation for new issues. Integrated with large language models, the method supports repair trajectory reconstruction, reasoning distillation, and file- or function-level localization and patch synthesis. Evaluated on SWE-Bench Lite, it significantly improves repair success rates, yielding absolute gains of 10.4%, 8.6%, and 10.3% with GPT-4o, DeepSeek-V3, and GPT-5, respectively.
📝 Abstract
Software issue resolution in large repositories is a long-range decision process: choices made during localization shape the space of viable edits, and missteps can compound into incorrect patches. Despite this, many LLM-based repair pipelines still operate in a reset-and-solve manner, producing fresh reasoning for every new issue instead of carrying forward what worked in past fixes. This is wasteful because repositories routinely contain earlier issues with overlapping structure, failure modes, or constraints, where prior repair experience could provide useful guidance. Existing approaches typically harvest this signal through forward-time trial procedures, such as repeated refinement or search, incurring high inference cost while still risking divergence from the eventual correct patch. We present an Outcome-Conditioned Reasoning Distillation (O-CRD) framework that uses resolved in-repository issues with verified patches as supervision. Starting from a historical fix, the method reconstructs a stage-wise repair trace backward from the verified outcome, then reuses the distilled guidance at inference time to steer file/function localization and patch synthesis, without fine-tuning or online search. On SWE-Bench Lite, this approach increases Pass@1 by 10.4% (absolute) with GPT-4o, 8.6% with DeepSeek-V3, and 10.3% with GPT-5, indicating that outcome-conditioned reuse of verified repairs can replace costly forward exploration for software issue resolution.
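The three stages the abstract describes (backward trace reconstruction from a verified patch, distillation into transferable guidance, and guidance-steered localization plus patch synthesis) could be sketched roughly as below. All function, type, and prompt names here are illustrative assumptions, not the paper's actual implementation, and the LLM is stubbed rather than called:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# An "LLM" is modeled as any prompt -> completion function.
LLM = Callable[[str], str]

@dataclass
class ResolvedIssue:
    report: str           # original issue description
    verified_patch: str   # patch confirmed correct by the test suite

def reconstruct_trace(issue: ResolvedIssue, llm: LLM) -> str:
    """Backward pass: condition on the verified outcome and ask the model
    to reconstruct a stage-wise repair trajectory leading to it."""
    prompt = (
        "Given this resolved issue and its verified patch, reconstruct the "
        "repair trajectory backward (symptom -> file -> function -> edit):\n"
        f"Issue: {issue.report}\nPatch: {issue.verified_patch}"
    )
    return llm(prompt)

def distill_guidance(traces: List[str], llm: LLM) -> str:
    """Compress per-issue trajectories into transferable repair guidance."""
    joined = "\n---\n".join(traces)
    return llm(f"Distill reusable repair guidance from these trajectories:\n{joined}")

def repair(new_issue: str, guidance: str, llm: LLM) -> Tuple[str, str]:
    """Inference time: distilled guidance steers file/function localization
    first, then patch synthesis -- no fine-tuning, no online search."""
    location = llm(f"Guidance:\n{guidance}\nLocalize the fault for: {new_issue}")
    patch = llm(
        f"Guidance:\n{guidance}\nIssue: {new_issue}\n"
        f"Location: {location}\nWrite a patch."
    )
    return location, patch

if __name__ == "__main__":
    # Deterministic stub in place of GPT-4o / DeepSeek-V3 / GPT-5.
    stub: LLM = lambda prompt: f"<completion for {len(prompt)}-char prompt>"
    history = [ResolvedIssue("crash in parser on empty input", "fix: guard against None")]
    traces = [reconstruct_trace(i, stub) for i in history]
    guidance = distill_guidance(traces, stub)
    location, patch = repair("new crash in lexer", guidance, stub)
    print(location, patch)
```

The key structural point the sketch tries to capture is that supervision flows backward from a known-correct patch, so no forward trial-and-error (repeated refinement or search) is needed at inference.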