PageGuide: Browser extension to assist users in navigating a webpage and locating information

📅 2026-04-26
📈 Citations: 0
✨ Influential: 0
📄 PDF

🤖 AI Summary
Users often struggle to locate information efficiently, carry out multi-step tasks, and resist distractions while browsing web pages, yet existing AI assistants do not visually ground their answers in page elements. This work proposes PageGuide, a browser extension that, for the first time, anchors large language model (LLM) outputs directly to the HTML DOM and implements three interaction modes through a frontend visual overlay: Find (highlighting target elements), Guide (step-by-step task guidance), and Hide (suppressing distracting content). The approach contributes verifiable answers, followable actions, and a decluttered interface. A user study shows that Hide improves accuracy by 26 percentage points and reduces task time by 70%; Guide increases the task completion rate by 30 percentage points; and Find decreases reliance on Ctrl+F by 80% while reducing task time by 19%.
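
To make the idea of anchoring LLM output to the DOM concrete, here is a minimal TypeScript sketch of one way such grounding could be checked before anything is drawn on the page. It is not taken from the PageGuide code: the types `PageElement`, `ModelReference`, and `OverlayAction`, the function `groundReferences`, and the overlay kinds `highlight`, `step`, and `dim` are all hypothetical names chosen for illustration. The sketch assumes each DOM node was given an id when the page was serialized for the model, and that the model returns the ids it relied on together with a quoted snippet.

```typescript
// Hypothetical sketch: none of these names come from the PageGuide codebase.
type Mode = "find" | "guide" | "hide";

interface PageElement {
  id: string;   // assumed identifier assigned to a DOM node during serialization
  text: string; // visible text of that node
}

interface ModelReference {
  elementId: string; // id the model claims supports its output
  quote: string;     // text the model claims appears in that element
}

interface OverlayAction {
  elementId: string;
  kind: "highlight" | "step" | "dim";
  order: number; // 1-based position, used to present Guide steps one at a time
}

// Assumed mapping from interaction mode to the overlay drawn on the element.
const OVERLAY_KIND: Record<Mode, OverlayAction["kind"]> = {
  find: "highlight",
  guide: "step",
  hide: "dim",
};

// Lowercase and collapse whitespace so small formatting differences do not break matching.
function normalize(s: string): string {
  return s.toLowerCase().replace(/\s+/g, " ").trim();
}

// Keep only references that point at a real element whose text contains the quote.
function groundReferences(
  mode: Mode,
  elements: PageElement[],
  refs: ModelReference[]
): OverlayAction[] {
  const actions: OverlayAction[] = [];
  for (const ref of refs) {
    const el = elements.filter((e) => e.id === ref.elementId)[0];
    if (!el) continue; // the model cited an id that is not on the page
    const quote = normalize(ref.quote);
    if (quote.length === 0) continue; // an empty quote verifies nothing
    if (normalize(el.text).indexOf(quote) === -1) continue; // quote not found in element
    actions.push({ elementId: el.id, kind: OVERLAY_KIND[mode], order: actions.length + 1 });
  }
  return actions;
}
```

In a real extension, a content script would then look up each surviving `elementId` in the live DOM and draw the corresponding overlay. Dropping references that fail the check is one possible design choice: it means every overlay the user sees points at text that is actually present on the page.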

πŸ“ Abstract
Users browsing the web daily struggle to quickly locate relevant information in cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants (e.g., ChatGPT, Gemini, Claude) and browser agents (e.g., OpenAI Operator, Browser Use) can answer questions and automate actions, yet they return answers without showing where the information comes from on the page, forcing users to manually verify results and blindly trust every automated step. We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs: (a) Find: locating and highlighting relevant evidence in situ so users can instantly verify answers on the page; (b) Guide: showing step-by-step instructions (e.g., how to change a password) one at a time so users can follow along and perform the actions themselves; and (c) Hide: suppressing distracting content while letting users decide whether to hide each element. In a user study (N=94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points (an 86.7% relative gain) and task completion time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces manual search effort, with Ctrl+F usage falling by 80% and task time decreasing by 19%. Code and demo are available at pageguide.github.io.
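
The two Hide accuracy figures in the abstract (26 percentage points absolute, 86.7% relative) together imply a baseline. The short derivation below works it out, assuming the relative gain is measured against unaided accuracy; the resulting accuracies are inferred from those two numbers and are not stated in the abstract itself.

```latex
% b = unaided Hide accuracy, a = accuracy with PageGuide (both in percent)
\begin{align*}
  a - b &= 26, \qquad \frac{a - b}{b} = 0.867 \\
  b &= \frac{26}{0.867} \approx 30.0 \\
  a &= b + 26 \approx 56.0
\end{align*}
```

Under that assumption, Hide accuracy would rise from roughly 30% unaided to roughly 56% with PageGuide, which is consistent with 26/30 ≈ 86.7%.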
Problem

Research questions and friction points this paper is trying to address.

web navigation
information location
distracting content
user verification
multi-step tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

browser extension
LLM grounding
visual overlay
DOM anchoring
user-guided navigation