🤖 AI Summary
This work addresses the novel task of fine-grained user intent recognition from UI interaction traces. To tackle this challenge, we propose the first systematic solution: (1) We formally define the UI trace intent recognition task, moving beyond conventional coarse-grained behavioral modeling; (2) We introduce a UI-aware semantic equivalence evaluation framework to address the inherent ambiguity and lexical diversity in intent descriptions; (3) Leveraging real-world, multi-source UI traces from Android and Web platforms, we construct the first reproducible benchmark, comprising both training and evaluation splits, via automated dataset curation. Experiments employ GPT-4 and Gemini-1.5 Pro for intent generation and paraphrase discrimination, and reveal a substantial performance gap between current SOTA LLMs and human annotators, which supports the task’s difficulty and research significance. This work establishes foundational infrastructure for personalized agents, search optimization, and deeper user behavior understanding.
📝 Abstract
Identifying underlying user goals and intents has been recognized as valuable in various personalization-oriented settings, such as personalized agents, improved search responses, advertising, and user analytics. In this paper, we propose a new task, goal identification from observed UI trajectories, which aims to infer the user's detailed intentions when performing a task within UI environments. To support this task, we also introduce a novel evaluation methodology designed to assess whether two intent descriptions can be considered paraphrases within a specific UI environment. Furthermore, we demonstrate how this task can leverage datasets designed for the inverse problem of UI automation, using Android and web datasets for our experiments. To benchmark this task, we compare the performance of humans and state-of-the-art models, specifically GPT-4 and Gemini-1.5 Pro, using our proposed metric. The results reveal that both Gemini and GPT underperform relative to humans, underscoring the challenge of the proposed task and the significant room for improvement. This work highlights the importance of goal identification within UI trajectories, providing a foundation for further exploration and advancement in this area.