SE-GA: Memory-Augmented Self-Evolution for GUI Agents

📅 2026-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing autonomous GUI agents struggle with multi-step tasks because limited context windows and static policies restrict their adaptability in dynamic environments. This work proposes SE-GA, a self-evolving GUI agent that combines a hierarchical memory architecture (episodic, semantic, and experiential memory) with Test-Time Memory Extension (TTME) and Memory-Augmented Self-Evolution (MASE). Together, these components enable dynamic context retrieval, long-horizon planning, and continual policy refinement. Experiments report success rates of 89.0% on ScreenSpot and 75.8% on AndroidControl-High, along with significant improvements in generalization on AndroidWorld.

📝 Abstract

Autonomous Graphical User Interface (GUI) agents often struggle with multi-step tasks due to constrained context windows and static policies that fail to adapt to dynamic environments. To address these limitations, this work proposes the Self-Evolving GUI Agent (SE-GA), a novel framework that integrates hierarchical memory structures with an iterative self-improvement mechanism. At the core of our approach is Test-Time Memory Extension (TTME), which facilitates long-term planning by dynamically retrieving episodic, semantic, and experiential memories to provide salient context during inference. To ensure continuous learning, we introduce Memory-Augmented Self-Evolution (MASE), a training pipeline that uses the data collected by TTME to stabilize and enhance the agent's foundational policy. Extensive evaluations across both offline and online benchmarks demonstrate that SE-GA achieves state-of-the-art performance, reaching success rates of 89.0% on ScreenSpot and 75.8% on the challenging AndroidControl-High dataset. Furthermore, significant improvements on the AndroidWorld benchmark highlight its superior generalization to dynamic environments. Open source code: https://github.com/jinshilong-dev/SE-GA
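
The abstract describes TTME only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation: retrieve the most relevant entries from three memory types and prepend them to the task as context. Every name here (`MemoryItem`, `HierarchicalMemory`, `overlap_score`, `build_context`) is hypothetical, and the word-overlap score is an assumed stand-in for whatever retriever SE-GA actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    kind: str  # "episodic", "semantic", or "experiential"
    text: str


def overlap_score(query: str, text: str) -> float:
    """Jaccard overlap of lowercase word sets (assumed stand-in for a real retriever)."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


@dataclass
class HierarchicalMemory:
    items: list = field(default_factory=list)

    def add(self, kind: str, text: str) -> None:
        self.items.append(MemoryItem(kind, text))

    def retrieve(self, query: str, k_per_kind: int = 1) -> list:
        """Return the top-k items of each memory type, ranked by overlap with the query."""
        selected = []
        for kind in ("episodic", "semantic", "experiential"):
            pool = [m for m in self.items if m.kind == kind]
            pool.sort(key=lambda m: overlap_score(query, m.text), reverse=True)
            selected.extend(pool[:k_per_kind])
        return selected


def build_context(memory: HierarchicalMemory, task: str, k_per_kind: int = 1) -> str:
    """Assemble an inference-time context: the task followed by retrieved memories."""
    lines = [f"Task: {task}"]
    for m in memory.retrieve(task, k_per_kind):
        lines.append(f"[{m.kind}] {m.text}")
    return "\n".join(lines)


# Hypothetical memory contents for illustration only.
memory = HierarchicalMemory()
memory.add("episodic", "step 3: opened the settings app")
memory.add("episodic", "step 1: unlocked the phone")
memory.add("semantic", "the wifi toggle lives under settings network")
memory.add("experiential", "to enable wifi open settings then tap network")
context = build_context(memory, "enable wifi in settings")
print(context)
```

Retrieving a fixed number of entries per memory type keeps the assembled context bounded regardless of how long the interaction history grows, which is the property the abstract attributes to retrieval under a constrained context window.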

Problem

Research questions and friction points this paper is trying to address.

GUI agents

multi-step tasks

context windows

static policies

dynamic environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-Augmented Self-Evolution

Test-Time Memory Extension

GUI Agent

Hierarchical Memory

Continuous Learning
