🤖 AI Summary
This work addresses a key limitation of current large language model (LLM)-based assistants: they rely heavily on explicit user instructions and struggle to capture implicit user intent, which constrains both the naturalness of interaction and the degree of personalization. To overcome this, the study introduces eye-tracking data as an implicit contextual signal for text-based LLM tasks, proposing a paradigm for personalized inference that requires no explicit prompts. The authors design multiple eye-movement representations and integrate them into the LLM prompt, so that the system automatically generates customized summaries in reading scenarios. Experimental results show that the approach effectively leverages eye-tracking signals to improve summary quality, benefiting user experience and support for downstream tasks in realistic applications.
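To make the core idea concrete, here is a minimal sketch (not the paper's implementation) of one plausible gaze representation: per-word dwell times are aggregated and verbalized into the prompt, so the LLM can weight its summary toward the passages the reader actually attended to. The `Fixation` class, the 250 ms dwell threshold, and the prompt wording are all illustrative assumptions, not values from the paper.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    """A single gaze fixation: which word was looked at, and for how long (ms)."""
    word_index: int
    duration_ms: float

def gaze_to_prompt_context(text: str, fixations: list[Fixation],
                           dwell_threshold_ms: float = 250.0) -> str:
    """Render gaze data as a textual signal that an LLM prompt can carry.

    Words whose accumulated fixation time exceeds the threshold are treated
    as high-attention regions (the threshold is an illustrative assumption).
    """
    words = text.split()
    dwell: dict[int, float] = {}
    for f in fixations:
        if 0 <= f.word_index < len(words):
            dwell[f.word_index] = dwell.get(f.word_index, 0.0) + f.duration_ms
    focused = [words[i] for i in sorted(dwell) if dwell[i] >= dwell_threshold_ms]
    return (
        "The reader's gaze dwelled longest on these words: "
        + ", ".join(focused)
        + ". Summarize the passage, emphasizing the content the reader attended to.\n\n"
        + text
    )

# Example: simulated fixations over a short passage.
passage = "Smart glasses integrate gaze tracking to infer implicit user intent"
fixations = [Fixation(2, 300), Fixation(3, 420), Fixation(8, 510), Fixation(2, 120)]
print(gaze_to_prompt_context(passage, fixations))
```

Other representations the paper alludes to (e.g., fixation sequences or reading-order features) could be serialized into the prompt in the same way; this dwell-time variant is just the simplest to illustrate.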
📝 Abstract
Smart glasses are accelerating progress toward more seamless and personalized LLM-based assistance by integrating multimodal inputs. Yet interaction with these systems still relies on obtrusive, explicit prompts. The advent of gaze tracking on smart devices offers a unique opportunity to extract implicit user intent for personalization. This paper investigates whether LLMs can interpret user gaze for text-based tasks. We evaluate different gaze representations for personalization and validate their effectiveness in realistic reading tasks. Results show that LLMs can leverage gaze to generate high-quality personalized summaries and to support users in downstream tasks, highlighting the feasibility and value of gaze-driven personalization for future mobile and wearable LLM applications.