MAGNET: Towards Adaptive GUI Agents with Memory-Driven Knowledge Evolution

📅 2026-01-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited generalization of mobile GUI agents under frequent UI updates and workflow reorganizations, a weakness that stems from their reliance on historical training data. To overcome this, the authors propose MAGNET, a memory-driven framework that separates stable functional semantics from stable task intents. A stationary memory links diverse visual features to stable functional semantics, a procedural memory captures task intents that hold across varying workflows, and an access-frequency-driven dynamic memory evolution mechanism continuously refines both memories as interfaces change. Integrated with a large language model agent and fine-tuned through online reinforcement learning, MAGNET substantially outperforms baseline methods on the AndroidWorld online benchmark and shows consistent gains under distribution shifts on offline benchmarks.

📝 Abstract

Mobile GUI agents powered by large foundation models enable autonomous task execution, but frequent updates that alter UI appearance and reorganize workflows cause agents trained on historical data to fail. Despite surface changes, functional semantics and task intents remain fundamentally stable. Building on this insight, we introduce MAGNET, a memory-driven adaptive agent framework with dual-level memory: stationary memory, which links diverse visual features to stable functional semantics for robust action grounding, and procedural memory, which captures stable task intents across varying workflows. We propose a dynamic memory evolution mechanism that continuously refines both memories by prioritizing frequently accessed knowledge. Evaluations on the online benchmark AndroidWorld show substantial improvements over baselines, while offline benchmarks confirm consistent gains under distribution shifts. These results validate that leveraging stable structures across interface changes improves agent performance and generalization in evolving software environments.
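
The abstract says the memories are refined by prioritizing frequently accessed knowledge but gives no implementation detail. The following is a minimal sketch of one way such prioritization could work, assuming a bounded key-value store that counts retrievals and evicts the least-accessed entry when full. The names `MemoryEntry`, `FrequencyMemory`, `capacity`, and `hits`, as well as the example keys and values, are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MemoryEntry:
    key: str        # e.g. a functional-semantics label or a task intent (assumed)
    value: str      # e.g. a visual description or a workflow outline (assumed)
    hits: int = 0   # number of times this entry has been retrieved


@dataclass
class FrequencyMemory:
    """Bounded store that keeps the most frequently accessed entries."""
    capacity: int
    entries: dict = field(default_factory=dict)

    def write(self, key: str, value: str) -> None:
        # Update in place if the key exists, preserving its hit count.
        if key in self.entries:
            self.entries[key].value = value
            return
        if len(self.entries) >= self.capacity:
            self._evict()
        self.entries[key] = MemoryEntry(key, value)

    def read(self, key: str) -> Optional[str]:
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry.hits += 1  # each retrieval raises the entry's priority
        return entry.value

    def _evict(self) -> None:
        # Drop the least-accessed entry; ties go to the oldest insertion.
        victim = min(self.entries.values(), key=lambda e: e.hits)
        del self.entries[victim.key]


# Hypothetical usage: a tiny "stationary" store for UI element semantics.
stationary = FrequencyMemory(capacity=2)
stationary.write("send_message", "paper-plane icon, bottom right")
stationary.write("open_settings", "gear icon, top bar")
stationary.read("send_message")                 # send_message now has 1 hit
stationary.write("search", "magnifier icon")    # store is full: evicts open_settings
```

In this sketch the same class could back both a stationary store (semantics keyed by function) and a procedural store (workflows keyed by intent). How MAGNET actually scores, merges, or generalizes entries is not specified in the abstract.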

Problem

Research questions and friction points this paper is trying to address.

GUI agents

interface changes

distribution shift

task generalization

software evolution

Innovation

Methods, ideas, or system contributions that make the work stand out.

memory-driven adaptation

functional semantics

task intent