CI4A: Semantic Component Interfaces for Agents Empowering Web Automation

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of large language models in fine-grained web interaction despite their strong performance in high-level semantic planning. To bridge this gap, the authors propose CI4A, a mechanism that abstracts complex UI component interactions into unified tool primitives through semantic encapsulation, thereby constructing an agent-optimized interaction interface that transcends traditional human-centric UI constraints. Implemented on Ant Design, CI4A covers 23 common UI components and features a hybrid agent architecture with a dynamically updated action space conditioned on page state. Evaluated on a reconstructed WebArena benchmark, the CI4A agent achieves a task success rate of 86.3%, substantially outperforming existing methods while significantly improving execution efficiency.

📝 Abstract
While Large Language Models demonstrate remarkable proficiency in high-level semantic planning, they remain limited in handling fine-grained, low-level web component manipulations. To address this limitation, extensive research has focused on enhancing model grounding capabilities through techniques such as Reinforcement Learning. However, rather than compelling agents to adapt to human-centric interfaces, we propose constructing interaction interfaces specifically optimized for agents. This paper introduces Component Interface for Agent (CI4A), a semantic encapsulation mechanism that abstracts the complex interaction logic of UI components into a set of unified tool primitives accessible to agents. We implemented CI4A within Ant Design, an industrial-grade front-end framework, covering 23 categories of commonly used UI components. Furthermore, we developed a hybrid agent featuring an action space that dynamically updates according to the page state, enabling flexible invocation of available CI4A tools. Leveraging the CI4A-integrated Ant Design, we refactored and upgraded the WebArena benchmark to evaluate existing SoTA methods. Experimental results demonstrate that the CI4A-based agent significantly outperforms existing approaches, achieving a new SoTA task success rate of 86.3%, alongside substantial improvements in execution efficiency.
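The two mechanisms the abstract names, semantic encapsulation of a component's interaction logic into tool primitives and an action space that follows the page state, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `Tool`, `DatePicker`, and `Page` classes, the `set_date` primitive, and the visibility rule are hypothetical and are not taken from the paper or from Ant Design's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tool:
    """One agent-callable primitive exposed by a UI component (hypothetical shape)."""
    name: str
    description: str
    run: Callable[..., str]


@dataclass
class DatePicker:
    """Toy stand-in for a date-picker component; not Ant Design's real API."""
    component_id: str
    value: str = ""
    visible: bool = True

    def tools(self) -> List[Tool]:
        def set_date(date: str) -> str:
            # A single semantic call stands in for the low-level sequence a
            # human-centric UI would need: open popup, navigate months, click a day.
            self.value = date
            return f"{self.component_id} set to {date}"

        return [
            Tool(
                name=f"{self.component_id}.set_date",
                description="Set the selected date (YYYY-MM-DD).",
                run=set_date,
            )
        ]


@dataclass
class Page:
    """Holds the components currently mounted on the page."""
    components: list = field(default_factory=list)

    def action_space(self) -> Dict[str, Tool]:
        # Rebuilt on every call, so the available tools track the page state.
        # Assumption: only visible components contribute tools.
        return {
            tool.name: tool
            for component in self.components
            if component.visible
            for tool in component.tools()
        }


page = Page([DatePicker("start_date"), DatePicker("end_date", visible=False)])
actions = page.action_space()
result = actions["start_date.set_date"].run("2026-01-21")
```

In this sketch the agent picks a tool by name from `actions` and calls its `run` function; once `end_date` becomes visible, the next call to `page.action_space()` would also offer `end_date.set_date`.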
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Web Automation
UI Components
Agent Interaction
Semantic Planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

CI4A
semantic component interface
agent-oriented UI
web automation
tool abstraction
Zhi Qiu
School of Cyberspace Science and Technology, Beijing Institute of Technology
Jiazheng Sun
School of Cyberspace Science and Technology, Beijing Institute of Technology; College of Computer Science and Artificial Intelligence, Fudan University
Chenxiao Xia
School of Cyberspace Science and Technology, Beijing Institute of Technology
Jun Zheng
School of Cyberspace Science and Technology, Beijing Institute of Technology
Xin Peng
East China University of Science and Technology
Artificial Intelligence, Machine Learning, Complex Process Modeling