Investigating Use Cases of AI-Powered Scene Description Applications for Blind and Low Vision People

📅 2024-03-22
🏛️ International Conference on Human Factors in Computing Systems
📈 Citations: 16
Influential: 5
🤖 AI Summary
This study investigates the real-world efficacy of fully AI-driven (i.e., non-human-assisted) scene description tools for people who are blind or have low vision, addressing critical gaps in usability, reliability, and trustworthiness. Method: The authors conducted a two-week diary study, semi-structured interviews, and thematic coding with 16 participants using a custom multimodal AI application, systematically identifying frequent and emergent usage patterns, including characterizing known objects and avoiding hazards. Contribution/Results: Findings reveal low user satisfaction (2.76/5) and trust (2.43/4), exposing fundamental limitations of AI-generated descriptions in accuracy, robustness, and explainability. The authors position this as the first empirical characterization of authentic usage patterns of fully automated visual assistive AI. The study provides actionable, user-informed evidence and concrete design directions for developing trustworthy, accessible AI tools, advancing both HCI research and inclusive AI engineering.

📝 Abstract
"Scene description" applications that describe visual content in a photo are useful daily tools for blind and low vision (BLV) people. Researchers have studied their use, but they have only explored those that leverage remote sighted assistants; little is known about applications that use AI to generate their descriptions. Thus, to investigate their use cases, we conducted a two-week diary study where 16 BLV participants used an AI-powered scene description application we designed. Through their diary entries and follow-up interviews, users shared their information goals and assessments of the visual descriptions they received. We analyzed the entries and found frequent use cases, such as identifying visual features of known objects, and surprising ones, such as avoiding contact with dangerous objects. We also found users scored the descriptions relatively low on average, 2.76 out of 5 (SD=1.49) for satisfaction and 2.43 out of 4 (SD=1.16) for trust, showing that descriptions still need significant improvements to deliver satisfying and trustworthy experiences. We discuss future opportunities for AI as it becomes a more powerful accessibility tool for BLV users.
Problem

Research questions and friction points this paper is trying to address.

Explores AI-powered scene description applications for blind and low vision users.
Identifies use cases like object identification and danger avoidance.
Highlights low user satisfaction and trust in current AI descriptions.
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-powered scene description application for BLV
Two-week diary study with 16 BLV participants
Analyzed user satisfaction and trust in AI descriptions
Ricardo E. Gonzalez Penuela
Cornell Tech
Jazmin Collins
PhD Student, Cornell University
virtual reality · human-computer interaction · accessibility · neurodivergence
Cynthia L. Bennett
Google
Shiri Azenkot
Cornell Tech