A Self-Improving Coding Agent

📅 2025-04-21
🤖 AI Summary
This work addresses how a coding agent can improve itself in a closed loop using only basic capabilities (code execution, tool invocation, self-reflection, and iterative code rewriting), without human intervention or predefined update rules. The authors propose an LLM-based coding agent that performs fully automated, open-ended self-modification, repeatedly editing its own implementation to improve task performance. The core contribution is to treat self-modification as an executable, verifiable software engineering process, moving agents from a reactive to an evolutionary paradigm. Empirically, performance on a random subset of SWE Bench Verified rises from 17% to 53% (a gain of 36 percentage points), with additional gains on LiveCodeBench and on synthetically generated agent benchmarks, supporting both the effectiveness and the generality of the approach.

📝 Abstract
We demonstrate that an LLM coding agent, equipped with basic coding tools, can autonomously edit itself, and thereby improve its performance on benchmark tasks. We find performance gains from 17% to 53% on a random subset of SWE Bench Verified, with additional performance gains on LiveCodeBench, as well as synthetically generated agent benchmarks. Our work represents an advancement in the automated and open-ended design of agentic systems, and provides a reference agent framework for those seeking to post-train LLMs on tool use and other agentic tasks.

Problem

Research questions and friction points this paper is trying to address.

Enabling LLM coding agents to autonomously edit themselves
Improving performance on benchmark coding tasks
Advancing automated design of agentic systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM coding agent autonomously edits itself
Performance rises from 17% to 53% on a random subset of SWE Bench Verified
Reference framework for post-training agentic tasks
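
The self-editing loop summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `AgentVersion`, `self_improve`, `toy_evaluate`, and `toy_propose_edit` are hypothetical, the keep-an-archive-and-edit-the-best strategy is an assumption, and the toy functions only stand in for running a real benchmark and for an LLM-driven code edit.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AgentVersion:
    """One iteration of the agent: its source text and its measured score."""
    source: str
    score: float


def self_improve(
    initial_source: str,
    evaluate: Callable[[str], float],
    propose_edit: Callable[[str, List[AgentVersion]], str],
    iterations: int,
) -> List[AgentVersion]:
    """Repeatedly edit the best agent seen so far and record every result."""
    archive = [AgentVersion(initial_source, evaluate(initial_source))]
    for _ in range(iterations):
        # Start each round from the highest-scoring version in the archive.
        best = max(archive, key=lambda v: v.score)
        # In a real system this step would be the agent rewriting its own code.
        candidate = propose_edit(best.source, archive)
        # Benchmark the candidate and keep it in the archive either way.
        archive.append(AgentVersion(candidate, evaluate(candidate)))
    return archive


def toy_evaluate(source: str) -> float:
    """Hypothetical benchmark: fraction of four imaginary tools present."""
    tools = {line for line in source.splitlines() if line.startswith("tool:")}
    return min(len(tools), 4) / 4


def toy_propose_edit(source: str, archive: List[AgentVersion]) -> str:
    """Hypothetical edit: append one new tool line to the agent source."""
    return source + f"\ntool:t{len(archive)}"


archive = self_improve("agent v0", toy_evaluate, toy_propose_edit, iterations=3)
for version in archive:
    print(version.score)
```

Keeping every version in the archive, rather than only the current best, lets a later round fall back to an earlier agent if an edit lowers the score.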