Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing code agents rely on cumbersome manual workflows and custom tools, making performance heavily dependent on prompt engineering and human curation—thus obscuring LLMs’ intrinsic programming capabilities while incurring high maintenance overhead and data leakage risks. Method: We propose Lita (Lite Agent), a lightweight, unified agent framework grounded in the “politeness” design principle: it decouples task execution from orchestration, minimizes prompt engineering, and eliminates manual intervention. Leveraging our derived Agent Complexity Law, we theoretically establish that as model capability improves, the performance gap between simple and complex agents narrows. Results: End-to-end evaluation on Aider Polyglot and SWE-Bench demonstrates that Lita matches or surpasses state-of-the-art workflow-based agents across multiple frontier models, while reducing token consumption and development/maintenance costs significantly. This work provides the first systematic empirical validation that lightweight agents effectively unlock LLMs’ native programming competence.

📝 Abstract
Large language models (LLMs) are increasingly being applied to programming tasks, ranging from single-turn code completion to autonomous agents. Current code agent designs frequently depend on complex, hand-crafted workflows and tool sets. However, this reliance on elaborate scaffolding presents several challenges: agent performance becomes overly dependent on prompt tuning and custom design choices, heavy human intervention obscures a model's true underlying capabilities, and intricate pipelines are costly to build and maintain. Furthermore, optimizing complex task prompts increases the risk of data leakage. Currently, when introducing new models, LLM providers like OpenAI and Anthropic often publish benchmark scores to demonstrate their models' coding proficiency, but keep their proprietary evaluation frameworks confidential. To address these limitations, we introduce Lita (Lite Agent), which operationalizes liteness, a principle of minimizing manual design while retaining the essential elements of a fully autonomous agent. Lita enables a more faithful and unified evaluation without elaborate scaffolding. Experiments on Aider Polyglot and SWE-Bench with frontier models demonstrate that Lita achieves competitive or superior performance compared to workflow-based and agentic baselines. Crucially, Lita also consumes fewer tokens and requires significantly less design effort. Our results suggest that Lita is sufficient to reveal the underlying coding competence of modern LLMs. Finally, we propose the Agent Complexity Law: the performance gap between agents of varying complexity, from simple to sophisticated designs, will shrink as the core model improves, ultimately converging to a negligible difference.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' coding capabilities without complex workflows
Reducing manual design dependency in autonomous code agents
Minimizing token consumption while maintaining competitive performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lita minimizes manual design for autonomous agents
It enables faithful evaluation without complex scaffolding
Lita consumes fewer tokens than workflow-based agents
Hankun Dai
Microsoft, University of Chinese Academy of Sciences
Maoquan Wang
Microsoft
Mengnan Qi
Unknown affiliation
Yikai Zhang
Fudan University
Zijian Jin
New York University
Yongqiang Yao
Microsoft
Yufan Huang
Microsoft
Shengyu Fu
Microsoft
Elsie Nallipogu
Microsoft