HiconAgent: History Context-aware Policy Optimization for GUI Agents

📅 2025-12-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

GUI agents must use historical context effectively in sequential navigation tasks, but naively feeding in the full history adds computational overhead and distracts the agent with irrelevant information. To address this, the paper introduces HiconAgent, a GUI agent trained with History Context-aware Policy Optimization (HCPO). HCPO combines two components: Dynamic Context Sampling (DCS), which presents the agent with variable-length histories during sampling, and Anchor-guided History Compression (AHC), a dual-branch policy update in which the compressed branch removes past observations but keeps past actions as information-flow anchors. A history-enhanced alignment loss couples the compressed and uncompressed branches to enforce consistent history usage. On GUI-Odyssey, HiconAgent-3B outperforms the larger GUI-R1-7B by 8.46% in grounding accuracy and 11.32% in step success rate. On AndroidControl and AITW, it achieves comparable results with up to 2.47× computational speedup and a 60% reduction in FLOPs.

📝 Abstract

Graphical User Interface (GUI) agents require effective use of historical context to perform sequential navigation tasks. While incorporating past actions and observations can improve decision making, naive use of full history leads to excessive computational overhead and distraction from irrelevant information. To address this, we introduce HiconAgent, a GUI agent trained with History Context-aware Policy Optimization (HCPO) for efficient and effective utilization of historical information. HCPO optimizes history usage in both sampling and policy updates through two complementary components: (1) Dynamic Context Sampling (DCS) presents the agent with variable length histories during sampling, enabling adaptive use of the most relevant context; (2) Anchor-guided History Compression (AHC) refines the policy update phase with a dual branch strategy where the compressed branch removes history observations while keeping history actions as information flow anchors. The compressed and uncompressed branches are coupled through a history-enhanced alignment loss to enforce consistent history usage while maintaining efficiency. Experiments on mainstream GUI navigation benchmarks demonstrate strong performance. Despite being smaller, HiconAgent-3B outperforms GUI-R1-7B by +8.46 percent grounding accuracy and +11.32 percent step success rate on GUI-Odyssey, while achieving comparable results on AndroidControl and AITW with up to 2.47x computational speedup and 60 percent FLOPs reduction.
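
The two components described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Step` record, the uniform choice of history length, and the function names `sample_history` and `compress_history` are all assumptions introduced here for illustration. `sample_history` mimics Dynamic Context Sampling by keeping a randomly sized suffix of the history, and `compress_history` mimics the compressed branch of Anchor-guided History Compression by dropping past observations while keeping past actions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One past interaction step (hypothetical structure, not from the paper)."""
    observation: Optional[str]  # e.g. a screenshot reference; None once compressed
    action: str                 # e.g. "click(search_bar)"


def sample_history(history: List[Step], max_len: int, rng: random.Random) -> List[Step]:
    """DCS-style sampling: keep the k most recent steps, with k drawn at random.

    Drawing k uniformly is an assumption; the abstract only states that the
    agent sees variable-length histories during sampling.
    """
    upper = min(max_len, len(history))
    k = rng.randint(0, upper)  # inclusive on both ends
    return history[len(history) - k:]


def compress_history(history: List[Step]) -> List[Step]:
    """AHC-style compressed branch: drop observations, keep actions as anchors."""
    return [Step(observation=None, action=step.action) for step in history]


history = [
    Step("screen_0.png", "click(search_bar)"),
    Step("screen_1.png", "type('weather')"),
    Step("screen_2.png", "press(enter)"),
]
rng = random.Random(0)
sampled = sample_history(history, max_len=2, rng=rng)
compressed = compress_history(history)
```

In this sketch the compressed history keeps the same number of steps and the same action sequence, so only the observation payload (typically the expensive part for a vision-language model) is removed.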

Problem

Research questions and friction points this paper is trying to address.

Optimizes GUI agents' use of historical context to reduce computational overhead

Addresses distraction from irrelevant information in sequential navigation tasks

Improves decision-making efficiency while maintaining strong benchmark performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Context Sampling for adaptive history usage

Anchor-guided History Compression with dual branch strategy

History-enhanced alignment loss that keeps history usage consistent across the compressed and uncompressed branches
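
The idea of coupling the two branches can be sketched as a consistency penalty between their action distributions. The exact form of the history-enhanced alignment loss is not given in this summary, so the KL divergence used below is a stand-in chosen for illustration, and the function names and example logits are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of action logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def alignment_penalty(full_logits: Sequence[float],
                      compressed_logits: Sequence[float]) -> float:
    """Assumed stand-in for the alignment term: KL(full || compressed).

    full_logits come from the branch that sees the full history;
    compressed_logits come from the branch that sees actions only.
    """
    return kl_divergence(softmax(full_logits), softmax(compressed_logits))


# Identical branch outputs incur no penalty; diverging outputs incur a positive one.
same = alignment_penalty([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = alignment_penalty([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Under this assumed form, the penalty is zero when both branches agree and grows as the compressed branch drifts from the full-history branch, which matches the stated goal of consistent history usage.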
