Rethinking Agent Design: From Top-Down Workflows to Bottom-Up Skill Evolution

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current LLM-based agents rely on manually designed task decomposition and workflows, hindering autonomous learning and evolution from experience. This paper proposes a bottom-up agent paradigm wherein agents learn end-to-end from raw pixel inputs and mouse actions, engaging in trial-and-error exploration, outcome reflection, and abstraction of reusable skills within open-ended game environments. The core innovation is a novel skill evolution mechanism—enabling autonomous skill distillation, cross-agent sharing, and incremental skill expansion—without manual workflow specification or domain-specific prompting. The framework is game-agnostic, requires no privileged APIs, and achieves zero-shot, cross-task skill acquisition in *Slay the Spire* and *Civilization V* without human intervention. Empirical results demonstrate its feasibility and scalability in highly complex, open-ended environments.

📝 Abstract
Most LLM-based agent frameworks adopt a top-down philosophy: humans decompose tasks, define workflows, and assign agents to execute each step. While effective on benchmark-style tasks, such systems rely on designer updates and overlook agents' potential to learn from experience. Recently, Silver and Sutton (2025) envisioned a shift into a new era in which agents could progress from a stream of experiences. In this paper, we instantiate this vision of experience-driven learning by introducing a bottom-up agent paradigm that mirrors the human learning process. Agents acquire competence through a trial-and-reasoning mechanism: exploring, reflecting on outcomes, and abstracting skills over time. Once acquired, skills can be rapidly shared and extended, enabling continual evolution rather than static replication. As more agents are deployed, their diverse experiences accelerate this collective process, making bottom-up design especially suited for open-ended environments. We evaluate this paradigm in Slay the Spire and Civilization V, where agents perceive through raw visual inputs and act via mouse outputs, just as human players do. Using a unified, game-agnostic codebase without any game-specific prompts or privileged APIs, our bottom-up agents acquire skills entirely through autonomous interaction, demonstrating the potential of the bottom-up paradigm in complex, real-world environments. Our code is available at https://github.com/AngusDujw/Bottom-Up-Agent.
Problem

Research questions and friction points this paper is trying to address.

Shifting from top-down to bottom-up agent design for learning
Enabling agents to learn skills autonomously through experience
Evaluating bottom-up agents in complex, open-ended game environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bottom-up agent paradigm mimics human learning
Trial-and-reasoning mechanism for skill acquisition
Game-agnostic codebase enables autonomous interaction
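
The loop sketched in these bullets (explore, reflect on the outcome, abstract a reusable skill, then share it across agents) can be illustrated in a few lines. This is a minimal, hypothetical sketch: all class, method, and reward names below are illustrative and do not come from the paper's released codebase.

```python
import random

class Skill:
    """A reusable action sequence distilled from successful trials (hypothetical)."""
    def __init__(self, name, actions):
        self.name = name
        self.actions = actions  # stand-in for a recorded mouse-action sequence
        self.uses = 0

class BottomUpAgent:
    """Minimal sketch of a trial-and-reasoning loop: explore, reflect, abstract."""
    def __init__(self):
        self.skill_library = {}  # skill name -> Skill

    def explore(self):
        # Trial: sample a candidate action sequence
        # (stand-in for acting on raw pixels via the mouse).
        return [random.choice(["click", "drag", "hover"]) for _ in range(3)]

    def reflect(self, actions, reward):
        # Reasoning: keep only trials whose observed outcome was positive.
        return reward > 0

    def abstract(self, actions):
        # Distill the successful trial into a named, reusable skill;
        # repeated successes increment the skill's usage count.
        name = "-".join(actions)
        skill = self.skill_library.setdefault(name, Skill(name, actions))
        skill.uses += 1
        return skill

    def share(self, other):
        # Cross-agent sharing: merge this agent's skills into a peer's library.
        other.skill_library.update(self.skill_library)

def run_episode(agent, reward_fn):
    actions = agent.explore()
    if agent.reflect(actions, reward_fn(actions)):
        agent.abstract(actions)
```

The key design point the paper emphasizes is that nothing here is game-specific: the loop only assumes an action space and an outcome signal, which is what lets the same codebase run on both Slay the Spire and Civilization V.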