SweeperBot: Making 3D Browsing Accessible through View Analysis and Visual Question Answering

📅 2025-11-18
🤖 AI Summary
Problem: Blind and low-vision (BLV) users struggle to explore and compare 3D models because existing 3D viewers offer, at best, creator-supplied alternative text that lacks sufficient detail. Method: This paper introduces SweeperBot, a system that lets screen reader users explore and compare 3D models through visual question answering (VQA). It combines an optimal view selection technique with generative- and recognition-based foundation models to answer users' visual questions in natural language. Contribution/Results: Unlike static alt-text, SweeperBot supports interactive, question-driven exploration and comparison across models. An expert review with 10 BLV users who have screen reader experience demonstrated the feasibility of using SweeperBot for these tasks, and a separate survey study with 30 sighted participants validated the quality of the generated descriptions.

πŸ“ Abstract
Accessing 3D models remains challenging for Screen Reader (SR) users. While some existing 3D viewers allow creators to provide alternative text, they often lack sufficient detail about the 3D models. Grounded on a formative study, this paper introduces SweeperBot, a system that enables SR users to leverage visual question answering to explore and compare 3D models. SweeperBot answers SR users' visual questions by combining an optimal view selection technique with the strength of generative- and recognition-based foundation models. An expert review with 10 Blind and Low-Vision (BLV) users with SR experience demonstrated the feasibility of using SweeperBot to assist BLV users in exploring and comparing 3D models. The quality of the descriptions generated by SweeperBot was validated by a second survey study with 30 sighted participants.

Problem

Research questions and friction points this paper is trying to address.

Making 3D browsing accessible for screen reader users
Addressing insufficient detail in existing 3D model descriptions
Enabling visual question answering for 3D model exploration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses visual question answering for 3D exploration
Combines optimal view selection with foundation models
Generates descriptions through recognition and generative models
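
The idea of selecting an optimal view before asking a model about it can be illustrated with a toy sketch. This summary does not describe the paper's actual selection criterion, so everything below is an assumption made for illustration: the candidate cameras are sampled on a Fibonacci sphere, and the hypothetical `view_score` simply rewards directions that face the most surface normals of the region of interest. The function names `fibonacci_sphere`, `view_score`, and `select_view` are invented here and are not taken from SweeperBot.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (candidate camera directions)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n          # z sweeps from near +1 to near -1
        r = math.sqrt(max(0.0, 1.0 - z * z))   # radius of the circle at height z
        theta = golden_angle * i
        directions.append((r * math.cos(theta), r * math.sin(theta), z))
    return directions

def view_score(direction, normals):
    """Hypothetical score: total amount by which surface normals face the camera."""
    total = 0.0
    for normal in normals:
        dot = sum(d * c for d, c in zip(direction, normal))
        total += max(0.0, dot)  # back-facing normals contribute nothing
    return total

def select_view(normals, n_candidates=64):
    """Pick the candidate direction with the highest (assumed) score."""
    candidates = fibonacci_sphere(n_candidates)
    return max(candidates, key=lambda d: view_score(d, normals))

if __name__ == "__main__":
    # A region whose surface faces straight up should be viewed from above.
    region_normals = [(0.0, 0.0, 1.0)] * 5
    print(select_view(region_normals))
```

In a full pipeline, the selected direction would be used to render an image of the model, and that image would be passed together with the user's question to a vision-language model. That step is omitted here because it depends on the specific renderer and model API.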