🤖 AI Summary
This work addresses the inefficiency of existing large language models in repository-level code generation, which stems from the absence of any mechanism for reusing task state across multiple attempts. To overcome this limitation, we propose LiveCoder, a novel framework that, for the first time, introduces persistent cross-attempt task states, unifying the modeling of both successful and failed experiences while maintaining a historically optimal repository structure. This transforms code generation into a continuous optimization process. Evaluated on RAL-Bench, LiveCoder improves functional correctness by up to 22.94 percentage points, attains a repository reuse rate of 81.58%, and reduces inference costs by 53.63%, all while preserving stable non-functional quality.
📝 Abstract
Large language models (LLMs) have achieved substantial progress in repository-level code generation. However, solving the same repository-level task often requires multiple attempts, yet existing methods optimize each attempt in isolation and do not preserve or reuse task-specific state across attempts. In this paper, we propose LiveCoder, a novel framework for repository-level code generation based on cross-attempt knowledge optimization. LiveCoder maintains persistent task-specific state from prior attempts to guide subsequent generation. This state comprises three components: success knowledge, which captures reusable signals from previously strong repositories; failure knowledge, which records unsuccessful outcomes and their diagnostic signals; and a historical-best repository, which preserves the strongest result found so far and prevents regression. Together, these components transform repeated repository generation into a persistent, knowledge-driven optimization process. We evaluate LiveCoder using four frontier LLMs on two representative repository-level code generation benchmarks. Extensive experimental results demonstrate the effectiveness and efficiency of LiveCoder: it improves the functional score by up to 22.94 percentage points, increases repository reuse to 81.58%, and reduces cost by up to 53.63% on RAL-Bench while maintaining broadly stable non-functional quality.
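The three-component state described above can be sketched as a small data structure. This is a minimal illustration under assumptions of our own, not the paper's implementation: all names (`TaskState`, `record_attempt`, the score field) are hypothetical, and the real system's knowledge representation is certainly richer than a list of dictionaries.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TaskState:
    """Hypothetical persistent cross-attempt state (illustrative only)."""
    success_knowledge: list = field(default_factory=list)  # reusable signals from strong repos
    failure_knowledge: list = field(default_factory=list)  # failed outcomes + diagnostics
    best_repo: Optional[dict] = None                       # historical-best repository snapshot
    best_score: float = float("-inf")

    def record_attempt(self, repo: dict, score: float, diagnostics: Optional[str] = None):
        """Fold one attempt's outcome into the persistent state."""
        if score > self.best_score:
            # Keep the strongest repository found so far, so later attempts never regress.
            self.best_repo, self.best_score = repo, score
            self.success_knowledge.append({"repo": repo, "score": score})
        else:
            # Failed or weaker attempts are retained with their diagnostic signals.
            self.failure_knowledge.append(
                {"repo": repo, "score": score, "diagnostics": diagnostics}
            )

# Repeated generation then reads from and writes to this state across attempts:
state = TaskState()
state.record_attempt({"files": ["a.py"]}, score=0.4)
state.record_attempt({"files": ["a.py", "b.py"]}, score=0.7)
state.record_attempt({"files": ["bad.py"]}, score=0.3, diagnostics="tests failed")
print(state.best_score)  # 0.7
```

In this sketch the historical-best repository is what prevents regression: a weaker third attempt is logged as failure knowledge but does not overwrite the best snapshot.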