DescribePro: Collaborative Audio Description with Human-AI Interaction

📅 2025-08-01
🤖 AI Summary
This study addresses the trade-off between human precision and AI efficiency in audio description (AD) production. It presents DescribePro, a human–AI collaborative AD authoring system that combines multimodal large language model prompting with manual editing, and that lets users fork and tag existing ADs to explore different narrative styles. The contributions are threefold: (1) a system design that brings together iterative AI-assisted editing, preservation of describers' stylistic choices, and community forking; (2) an evaluation with 18 describers (9 professionals and 9 novices) showing that AI support reduces repetitive work and eases the cognitive load for novices; and (3) evidence that collaborative tags and variations show potential for customization, version control, and training new describers.

📝 Abstract
Audio description (AD) makes video content accessible to millions of blind and low vision (BLV) users. However, creating high-quality AD involves a trade-off between the precision of human-crafted descriptions and the efficiency of AI-generated ones. To address this, we present DescribePro, a collaborative AD authoring system that enables describers to iteratively refine AI-generated descriptions through multimodal large language model prompting and manual editing. DescribePro also supports community collaboration by allowing users to fork and edit existing ADs, enabling the exploration of different narrative styles. We evaluate DescribePro with 18 describers (9 professionals and 9 novices) using quantitative and qualitative methods. Results show that AI support reduces repetitive work while helping professionals preserve their stylistic choices and easing the cognitive load for novices. Collaborative tags and variations show potential for providing customizations, version control, and training new describers. These findings highlight the potential of collaborative, AI-assisted tools to enhance and scale AD authorship.
Problem

Research questions and friction points this paper is trying to address.

Balancing precision and efficiency in audio description creation
Enabling iterative refinement of AI-generated descriptions collaboratively
Supporting community collaboration and the exploration of diverse narrative styles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Collaborative AD authoring with human-AI interaction
Multimodal large language model prompting
Community collaboration via forking and editing
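
The forking and tagging idea above can be pictured with a small data-structure sketch. This is an illustration only, not DescribePro's actual implementation: the `Segment` and `Description` classes, their fields, and the `fork` and `edit` methods are hypothetical names chosen for this example. It shows one plausible way a forked AD could keep a link to its parent and carry style tags while leaving the original untouched.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class Segment:
    """One timed description line (hypothetical structure)."""
    start_s: float
    end_s: float
    text: str


@dataclass
class Description:
    """An audio description track that can be forked and tagged (hypothetical)."""
    id: str
    author: str
    segments: list[Segment]
    tags: set[str] = field(default_factory=set)
    parent_id: Optional[str] = None  # set on forks, None for originals

    def fork(self, new_id: str, author: str, extra_tags=()) -> "Description":
        # Copy the segment list so edits to the fork never touch the parent.
        return Description(
            id=new_id,
            author=author,
            segments=list(self.segments),
            tags=self.tags | set(extra_tags),
            parent_id=self.id,
        )

    def edit(self, index: int, text: str) -> None:
        # Segments are immutable, so an edit swaps in a new Segment.
        self.segments[index] = replace(self.segments[index], text=text)


original = Description(
    id="ad-1",
    author="professional",
    segments=[Segment(0.0, 3.5, "A woman walks into a dim kitchen.")],
    tags={"concise"},
)
variant = original.fork("ad-2", author="novice", extra_tags=["expressive"])
variant.edit(0, "A woman drifts into a kitchen lit only by the stove clock.")
```

Keeping `parent_id` on each fork gives a simple lineage that a tag-based version view could walk, and copying the segment list keeps each variation independent of its source.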