ADCanvas: Accessible and Conversational Audio Description Authoring for Blind and Low Vision Creators

πŸ“… 2026-02-06
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the limitations of existing audio description (AD) tools, which rely heavily on visual interfaces and thus fail to meet the needs of blind and low-vision (BLV) video creators. The authors propose ADCanvas, the first end-to-end AD authoring system that integrates a conversational multimodal large language model with an accessible editing environment. Designed for full compatibility with screen readers, keyboard-driven playback controls, and plain-text editing, ADCanvas also incorporates real-time visual question answering (VQA) capabilities. A user study with 12 BLV creators demonstrates that the system effectively serves as both an informational assistant and a draft-generation tool, enabling users to efficiently produce and refine AD scripts while retaining full creative control. These findings validate the system’s usability and practical utility in real-world AD creation workflows.

Technology Category

Application Category

πŸ“ Abstract
Audio Description (AD) provides essential access to visual media for blind and low vision (BLV) audiences. Yet current AD production tools remain largely inaccessible to BLV video creators, who possess valuable expertise but face barriers due to visually-driven interfaces. We present ADCanvas, a multimodal authoring system that supports non-visual control over audio description (AD) creation. ADCanvas combines conversational interaction with keyboard-based playback control and a plain-text, screen reader-accessible editor to support end-to-end AD authoring and visual question answering (VQA). Combining screen-reader-friendly controls with a multimodal LLM agent, ADCanvas supports live VQA, script generation, and AD modification. Through a user study with 12 BLV video creators, we find that users adopt the conversational agent as an informational aide and drafting assistant, while maintaining agency through verification and editing. For example, participants saw themselves as curators who received information from the model and filtered it down for their audience. Our findings offer design implications for accessible media tools, including precise editing controls, accessibility support for creative ideation, and configurable rules for human-AI collaboration.
Problem

Research questions and friction points this paper is trying to address.

Audio Description
Accessibility
Blind and Low Vision
Media Authoring
Inclusive Design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Audio Description
Accessible Authoring
Conversational AI
Multimodal LLM
Visual Question Answering
πŸ”Ž Similar Papers
No similar papers found.