A11y-Compressor: A Framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction

📅 2026-05-01

🤖 AI Summary
Current GUI agents rely on accessibility trees that are often redundant and lack explicit spatial structure, which limits observation efficiency and task performance. This work proposes A11y-Compressor, a framework whose implementation, Compressed-a11y, is a lightweight transformation pipeline that combines modal detection, redundancy reduction, and semantic structuring to convert linearized accessibility trees into compact, structured representations that retain spatial relationships among elements. Evaluated on the OSWorld benchmark, the method reduces input tokens to 22% of the original while improving the task success rate by 5.1 percentage points on average.
📝 Abstract
AI agents that interact with graphical user interfaces (GUIs) require effective observation representations for reliable grounding. The accessibility tree is a commonly used text-based format that encodes UI element attributes, but it suffers from redundancy and lacks structural information such as spatial relationships among elements. We propose A11y-Compressor, a framework that transforms linearized accessibility trees into compact and structured representations. Our implementation, Compressed-a11y, applies a lightweight and structured transformation pipeline with modal detection, redundancy reduction, and semantic structuring. Experiments on the OSWorld benchmark show that Compressed-a11y reduces input tokens to 22% of the original while improving task success rates by 5.1 percentage points on average.
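
The three stages named in the abstract (modal detection, redundancy reduction, semantic structuring) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` fields, the `modal` flag, the `INTERACTIVE` role set, the bounding-box filter, and the row-bucketing heuristic are all assumptions chosen only to show the general shape of such a pipeline.

```python
from dataclasses import dataclass
from itertools import groupby

# Assumed set of roles worth keeping even when unnamed (hypothetical).
INTERACTIVE = {"button", "entry", "link", "checkbox", "menu-item"}


@dataclass(frozen=True)
class Node:
    """Hypothetical flattened accessibility-tree entry."""
    role: str
    name: str
    x: int
    y: int
    w: int = 0
    h: int = 0
    modal: bool = False  # assumed flag; real trees expose modality differently


def detect_modal(nodes):
    """Stage 1 (assumed heuristic): if a modal dialog exists, keep only
    nodes whose top-left corner lies inside its bounding box."""
    dialog = next((n for n in nodes if n.modal), None)
    if dialog is None:
        return list(nodes)
    return [
        n for n in nodes
        if dialog.x <= n.x < dialog.x + dialog.w
        and dialog.y <= n.y < dialog.y + dialog.h
    ]


def reduce_redundancy(nodes):
    """Stage 2 (assumed heuristic): drop unnamed non-interactive nodes
    and exact duplicates of (role, name, position)."""
    seen, kept = set(), []
    for n in nodes:
        if not n.name.strip() and n.role not in INTERACTIVE:
            continue
        key = (n.role, n.name, n.x, n.y)
        if key not in seen:
            seen.add(key)
            kept.append(n)
    return kept


def structure_rows(nodes, row_height=40):
    """Stage 3 (assumed heuristic): bucket nodes into visual rows by y,
    order each row left to right, and emit one text line per row."""
    ordered = sorted(nodes, key=lambda n: (n.y // row_height, n.x))
    return [
        " | ".join(f"{n.role}:{n.name}" for n in row)
        for _, row in groupby(ordered, key=lambda n: n.y // row_height)
    ]


def compress(nodes):
    """Chain the three stages into one transformation."""
    return structure_rows(reduce_redundancy(detect_modal(nodes)))


sample = [
    Node("button", "File", 0, 0),
    Node("panel", "", 0, 0),
    Node("dialog", "Save changes?", 100, 100, 300, 120, modal=True),
    Node("button", "Cancel", 120, 180),
    Node("button", "Save", 220, 180),
    Node("button", "Save", 220, 180),  # duplicate entry
]
print("\n".join(compress(sample)))
```

In this toy input, the elements behind the dialog and the duplicate button are dropped, and the two remaining buttons share one output line because they fall in the same row bucket. That is the sense in which a row-oriented layout can carry spatial relationships that a linearized tree does not state explicitly.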
Problem

Research questions and friction points this paper is trying to address.

GUI agents
accessibility tree
redundancy
spatial relationships
observation representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

A11y-Compressor
accessibility tree compression
GUI agent observation
redundancy reduction
structured representation