🤖 AI Summary
Static human-curated datasets are difficult to scale, while raw interaction logs are noisy and information-sparse, which limits how effectively models can be trained on them. This work proposes Echo, a framework that, for the first time, models users' revision sequences of AI-generated outputs as a knowledge distillation process, establishing a general paradigm for extracting high-quality training signals from real-world interaction data. By integrating refined sequence extraction, continual learning, and log alignment techniques, Echo overcomes the bottlenecks imposed by static datasets. Evaluated in a production code-completion environment, the approach raises the model's acceptance rate from 25.7% to 35.7% (a gain of 10 percentage points), demonstrating its capacity for continuous improvement and its practical effectiveness.
📝 Abstract
Static "human data" faces inherent limitations: it is expensive to scale and bounded by the knowledge of its creators. Continuous learning from "experience data" (interactions between agents and their environments) promises to transcend these barriers. Today, the widespread deployment of AI agents grants us low-cost access to massive streams of such real-world experience. However, raw interaction logs are inherently noisy, dominated by trial and error, and low in information density, which makes them inefficient for direct model training.
We introduce Echo, a generalized framework designed to operationalize the transition from raw experience to learnable knowledge, effectively "echoing" environmental feedback back into the training loop for model optimization. In today's agent ecosystem, user refinement serves as a primary source of such feedback: driven by responsibility for the outcome, users rigorously transform flawed agent proposals into verified solutions. These user-driven refinement sequences inherently distill agents' crude attempts into high-quality training signals. Echo systematically harvests these signals to continuously align the agent with real-world needs. Large-scale validation in a production code completion environment confirms that Echo effectively harnesses this pipeline, breaking the static performance ceiling by increasing the acceptance rate from 25.7% to 35.7%.
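To make the idea of harvesting refinement sequences concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical log record (`Session`) holding the editing context, the agent's proposal, and the user's successive revisions, and it turns each record into a (context, refined target) training pair. The similarity thresholds `min_sim` and `max_sim` are illustrative assumptions, not values from the paper.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Session:
    """One hypothetical logged interaction (field names are assumptions)."""
    context: str          # code surrounding the completion site
    proposal: str         # what the agent suggested
    revisions: list[str]  # successive user-edited versions, oldest first


def to_training_pair(session, min_sim=0.3, max_sim=0.98):
    """Return (context, refined_target), or None if the session looks uninformative."""
    if not session.revisions:
        return None  # the user never refined the proposal, nothing to distill
    final = session.revisions[-1]
    sim = SequenceMatcher(None, session.proposal, final).ratio()
    # Too similar: the user barely changed anything, so little new signal.
    # Too different: the user likely discarded the proposal rather than refining it.
    if sim > max_sim or sim < min_sim:
        return None
    return session.context, final


if __name__ == "__main__":
    s = Session(
        context="def distance(a, b):\n    ",
        proposal="return a + b",
        revisions=["return a - b", "return abs(a - b)"],
    )
    print(to_training_pair(s))
```

Only the final revision is used as the target here, on the assumption that it is the version the user accepted responsibility for; intermediate revisions are treated as trial and error and dropped.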