CoSight: Exploring Viewer Contributions to Online Video Accessibility Through Descriptive Commenting

Date: 2025-08-11
Citations: 0
Influential: 0
AI Summary
Problem: Professional audio description (AD) for video content is costly and difficult to scale. Method: This paper introduces a viewer-participatory approach to video accessibility that uses lightweight in-situ prompts to elicit grounded visual descriptions from everyday viewers as they watch and comment on YouTube. Drawing on the Fogg Behavior Model, it implements CoSight, a Chrome extension that combines visual indicators of accessibility gaps, pop-up hints about what to describe, reminders to clarify vague comments, and related captions and comments as references. Contribution/Results: In an exploratory study with 48 sighted participants, 89% of comments included grounded visual descriptions. Follow-up interviews with four blind and low vision (BLV) viewers and four professional AD writers suggest that, although such comments do not match the rigor of professional AD, they can complement it, particularly by conveying visual context and emotional nuance. To the authors' knowledge, this is the first systematic integration of crowdsourced descriptive commentary into video accessibility practice, offering a low-cost path toward more scalable, inclusive audiovisual content.

๐Ÿ“ Abstract
The rapid growth of online video content has outpaced efforts to make visual information accessible to blind and low vision (BLV) audiences. While professional Audio Description (AD) remains the gold standard, it is costly and difficult to scale across the vast volume of online media. In this work, we explore a complementary approach to broaden participation in video accessibility: engaging everyday video viewers while they watch and comment. We introduce CoSight, a Chrome extension that augments YouTube with lightweight, in-situ nudges to support descriptive commenting. Drawing from Fogg's Behavior Model, CoSight provides visual indicators of accessibility gaps, pop-up hints for what to describe, reminders to clarify vague comments, and related captions and comments as references. In an exploratory study with 48 sighted users, CoSight helped integrate accessibility contribution into natural viewing and commenting practices, with 89% of comments including grounded visual descriptions. Follow-up interviews with four BLV viewers and four professional AD writers suggest that while such comments do not match the rigor of professional AD, they can offer complementary value by conveying visual context and emotional nuance that help viewers understand the videos.

Problem

Research questions and friction points this paper is trying to address.

Enhancing online video accessibility for BLV audiences
Scaling descriptive commenting by engaging everyday viewers
Bridging gaps between professional AD and viewer contributions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Chrome extension for YouTube accessibility
Lightweight in-situ nudges for descriptions
Visual indicators and pop-up hints
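
One of the nudges listed in the abstract is a reminder to clarify vague comments. The abstract does not say how CoSight decides that a comment is vague, so the following is only a minimal sketch of one possible approach, assuming a simple keyword heuristic. The names `DEICTIC_WORDS`, `isVagueComment`, and `reminderFor`, the word list, and the threshold are all hypothetical and are not taken from the paper.

```typescript
// Hypothetical heuristic for illustration only; not CoSight's actual method.
// Assumption: a comment is "vague" when it relies on deictic words
// ("this", "that", "it", ...) and has few other words to ground them.
const DEICTIC_WORDS = new Set([
  "this", "that", "it", "he", "she", "they", "here", "there",
]);

// Lowercase the comment and split it into alphabetic tokens.
function tokenize(comment: string): string[] {
  return comment.toLowerCase().match(/[a-z']+/g) ?? [];
}

// True when the comment contains at least one deictic word and fewer than
// `minContentWords` non-deictic words (threshold is an assumed default).
function isVagueComment(comment: string, minContentWords = 4): boolean {
  const tokens = tokenize(comment);
  const deicticCount = tokens.filter((t) => DEICTIC_WORDS.has(t)).length;
  const contentCount = tokens.length - deicticCount;
  return deicticCount > 0 && contentCount < minContentWords;
}

// Returns reminder text for a vague comment, or null when no nudge is needed.
function reminderFor(comment: string): string | null {
  return isVagueComment(comment)
    ? "Consider describing what is on screen so viewers who cannot see it can follow."
    : null;
}
```

Under these assumptions, a short comment such as "that was hilarious" would trigger the reminder, while "The cat knocks the red mug off the kitchen table" would not. A real system would likely need a more robust signal than word counts.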
Ruolin Wang
Georgia Institute of Technology, Atlanta, GA, United States
Xingyu Liu
UCLA HCI Research, Los Angeles, CA, United States
Biao Wang
Cornell University, Ithaca, NY, United States
Wayne Zhang
UCLA HCI Research, Los Angeles, CA, United States
Ziqian Liao
Harvard University, Boston, MA, United States
Ziwen Li
UCLA HCI Research, Los Angeles, CA, United States
Amy Pavel
Assistant Professor, UC Berkeley EECS
Human-Computer Interaction, Accessibility, Human-AI Interaction, Video
Xiang 'Anthony' Chen
Associate Professor, UCLA
Human-Computer Interaction