CI4A: Semantic Component Interfaces for Agents Empowering Web Automation

📅 2026-01-21

🤖 AI Summary

This work addresses the limitations of large language models in fine-grained web interaction, despite their strong performance in high-level semantic planning. To bridge this gap, the authors propose CI4A, a semantic encapsulation mechanism that abstracts complex UI component interactions into unified tool primitives, giving agents an interaction interface designed for them rather than for human users. Implemented on Ant Design, CI4A covers 23 categories of common UI components and is paired with a hybrid agent whose action space updates dynamically with the page state. Evaluated on a WebArena benchmark refactored to use the CI4A-integrated Ant Design, the CI4A agent achieves a task success rate of 86.3%, outperforming existing methods while also improving execution efficiency.

📝 Abstract

While Large Language Models demonstrate remarkable proficiency in high-level semantic planning, they remain limited in handling fine-grained, low-level web component manipulations. To address this limitation, extensive research has focused on enhancing model grounding capabilities through techniques such as Reinforcement Learning. However, rather than compelling agents to adapt to human-centric interfaces, we propose constructing interaction interfaces specifically optimized for agents. This paper introduces Component Interface for Agent (CI4A), a semantic encapsulation mechanism that abstracts the complex interaction logic of UI components into a set of unified tool primitives accessible to agents. We implemented CI4A within Ant Design, an industrial-grade front-end framework, covering 23 categories of commonly used UI components. Furthermore, we developed a hybrid agent whose action space updates dynamically according to the page state, enabling flexible invocation of the CI4A tools available at each step. Leveraging the CI4A-integrated Ant Design, we refactored and upgraded the WebArena benchmark to evaluate existing state-of-the-art (SoTA) methods. Experimental results demonstrate that the CI4A-based agent significantly outperforms existing approaches, achieving a new SoTA task success rate of 86.3%, alongside substantial improvements in execution efficiency.
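
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `SelectComponent` standing in for a dropdown (it does not use Ant Design's real API), a hypothetical `Tool` record as the "tool primitive", and a hypothetical `build_action_space` helper that rebuilds the agent's action space from whichever components are visible in the current page state. All names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """Hypothetical tool primitive that a UI component exposes to an agent."""
    name: str
    description: str
    run: Callable[..., str]


class SelectComponent:
    """Stand-in for a dropdown component (an assumption, not Ant Design's API)."""

    def __init__(self, component_id: str, options: List[str]) -> None:
        self.component_id = component_id
        self.options = list(options)
        self.value = None
        self.visible = True

    def set_value(self, option: str) -> str:
        # One semantic call replaces the low-level sequence a human would
        # perform: click to open, scroll the list, click the option.
        if option not in self.options:
            raise ValueError(f"unknown option: {option!r}")
        self.value = option
        return f"{self.component_id} set to {option}"

    def tools(self) -> List[Tool]:
        # The component describes its own interaction logic as named tools.
        return [
            Tool(
                name=f"{self.component_id}.set_value",
                description=f"Choose one of {self.options}",
                run=self.set_value,
            )
        ]


def build_action_space(components: List[SelectComponent]) -> Dict[str, Tool]:
    """Collect tools only from components visible in the current page state."""
    space: Dict[str, Tool] = {}
    for component in components:
        if component.visible:
            for tool in component.tools():
                space[tool.name] = tool
    return space


country = SelectComponent("country", ["Canada", "Japan"])
province = SelectComponent("province", ["Ontario"])
province.visible = False  # not rendered yet, so its tool is not offered

actions = build_action_space([country, province])
print(sorted(actions))                             # ['country.set_value']
print(actions["country.set_value"].run("Japan"))   # country set to Japan
```

In this sketch the action space is recomputed from page state on every step, so a tool appears only while its component is present; how CI4A itself registers and filters tools is not specified in the abstract.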

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Web Automation

UI Components

Agent Interaction

Semantic Planning

Innovation

Methods, ideas, or system contributions that make the work stand out.

CI4A

semantic component interface

agent-oriented UI

web automation

tool abstraction
