AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

📅 2026-03-18
🤖 AI Summary
This work addresses the degraded task completion performance of existing long-horizon GUI agents, which often stems from inefficient memory mechanisms leading to redundant replays or loss of critical dependency information. The study presents the first systematic diagnosis of this memory failure and introduces Anchored State Memory (ASM), a novel approach that compresses interaction trajectories into structured intermediate state anchors through causal dependency analysis, thereby enabling subgoal retrieval and attribution-aware decision-making. To support this investigation, the authors develop the AndroTMem diagnostic framework and a new benchmark, AndroTMem-Bench, comprising 1,069 tasks with strong causal dependencies. Experiments across 12 GUI agents demonstrate that ASM consistently outperforms full-sequence replay and summarization baselines, improving task completion rates by 5%–30.16% and average memory scores by 4.93%–24.66%.

📝 Abstract
Long-horizon GUI agents are a key step toward real-world deployment, yet effective interaction memory under prevailing paradigms remains under-explored. Replaying full interaction sequences is redundant and amplifies noise, while summaries often erase dependency-critical information and traceability. We present AndroTMem, a diagnostic framework for anchored memory in long-horizon Android GUI agents. Its core benchmark, AndroTMem-Bench, comprises 1,069 tasks with 34,473 interaction steps (avg. 32.1 per task, max. 65). We evaluate agents with TCR (Task Complete Rate), focusing on tasks whose completion requires carrying forward critical intermediate state; AndroTMem-Bench is designed to enforce strong step-to-step causal dependencies, making sparse yet essential intermediate states decisive for downstream actions and centering interaction memory in evaluation. Across open- and closed-source GUI agents, we observe a consistent pattern: as interaction sequences grow longer, performance drops are driven mainly by within-task memory failures, not isolated perception errors or local action mistakes. Guided by this diagnosis, we propose Anchored State Memory (ASM), which represents interaction sequences as a compact set of causally linked intermediate-state anchors to enable subgoal-targeted retrieval and attribution-aware decision making. Across multiple settings and 12 evaluated GUI agents, ASM consistently outperforms full-sequence replay and summary-based baselines, improving TCR by 5%-30.16% and AMS by 4.93%-24.66%, indicating that anchored, structured memory effectively mitigates the interaction-memory bottleneck in long-horizon GUI tasks. The code, benchmark, and related resources are publicly available at [https://github.com/CVC2233/AndroTMem](https://github.com/CVC2233/AndroTMem).
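The abstract describes ASM only at a high level: interaction sequences become a compact set of causally linked intermediate-state anchors that can be retrieved per subgoal. The sketch below is one illustrative reading of that idea, not the authors' implementation. The `Anchor` fields, the `AnchoredMemory` class, the `retrieve` method, and the example task values are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Anchor:
    """One intermediate state worth carrying forward (hypothetical schema)."""
    anchor_id: str
    step: int        # index of the interaction step that produced this state
    subgoal: str     # subgoal the state belongs to
    state: dict      # the recorded values, e.g. {"price": "420"}
    depends_on: list = field(default_factory=list)  # ids of upstream anchors


class AnchoredMemory:
    """Stores anchors and retrieves them by subgoal with their causal ancestors."""

    def __init__(self):
        self._anchors = {}

    def add(self, anchor):
        self._anchors[anchor.anchor_id] = anchor

    def retrieve(self, subgoal):
        """Return anchors for `subgoal` plus all anchors they depend on, in step order."""
        stack = [a.anchor_id for a in self._anchors.values() if a.subgoal == subgoal]
        seen = set()
        while stack:
            aid = stack.pop()
            if aid in seen or aid not in self._anchors:
                continue
            seen.add(aid)
            # Follow causal links so dependency-critical state is not dropped.
            stack.extend(self._anchors[aid].depends_on)
        return sorted((self._anchors[a] for a in seen), key=lambda a: a.step)


# Made-up trajectory: only four of many steps produce state that later steps need.
memory = AnchoredMemory()
memory.add(Anchor("a1", 3, "find_flight", {"flight_no": "XY123"}))
memory.add(Anchor("a2", 9, "copy_price", {"price": "420"}, depends_on=["a1"]))
memory.add(Anchor("a3", 14, "set_alarm", {"time": "06:30"}))
memory.add(Anchor("a4", 21, "send_message", {"recipient": "Sam"}, depends_on=["a2"]))

context = memory.retrieve("send_message")
retrieved_ids = [a.anchor_id for a in context]
```

In this toy example, retrieving for `send_message` pulls in the flight and price anchors through the dependency links while leaving out the unrelated alarm anchor. That is the contrast the abstract draws with full-sequence replay (everything is kept) and summaries (dependencies may be lost).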
Problem

Research questions and friction points this paper is trying to address.

long-horizon GUI agents
interaction memory
intermediate state
causal dependencies
memory bottleneck
Innovation

Methods, ideas, or system contributions that make the work stand out.

anchored memory
long-horizon GUI agents
interaction trajectories
causal dependencies
Anchored State Memory (ASM)
Yibo Shi
XJTU
Jungang Li
HKUST(GZ)
Linghao Zhang
CityU
Zihao Dongfang
HKUST(GZ)
Biao Wu
Artificial Intelligence, University of Technology Sydney
Multimodal Information Processing
Sicheng Tao
HKUST(GZ)
Yibo Yan
East China Normal University
High-dimensional Statistics
Chenxi Qin
TJU
Weiting Liu
FDU
Zhixin Lin
SDU
Hanqian Li
M.Phil @HKUST(GZ)
Computer Vision, Large Language Model, Natural Language Processing
Yu Huang
HKUST(GZ)
Trustworthy AI
Song Dai
HKUST(GZ)
Yonghua Hei
HKUST(GZ)
Yue Ding
CASIA
Xiang Li
HKUST(GZ)
Shikang Wang
CityU
Chengdong Xu
SYSU
Jingqi Liu
XJTU
Xueying Ma
XJTU
Zhiwen Zheng
XJTU
Xiaofei Zhang
University of Memphis
Database Systems, Graph Algorithms & Practices, Distributed & Parallel Computing
Bincheng Wang
NWPU
Nichen Yang
XJTU
Jie Wu
SYSU