🤖 AI Summary
Problem: Traditional scriptwriting is text-centric and lacks intuitive visualization of audiovisual elements, which hinders creative ideation and iterative refinement. Method: We propose the first end-to-end, synchronized text-to-audiovisual generation framework designed specifically for film and television dialogue scripts. It integrates large language models, emotion-aware text-to-speech (TTS), 3D character animation, and an interactive UI to enable real-time, fine-grained editing of facial expressions, vocal prosody, camera angles, and other cinematic parameters. Contribution/Results: The framework bridges the modality gap between textual scripts and audiovisual outputs, enabling bidirectional co-creation in which script edits dynamically update scenes and vice versa. A user study with professional and novice screenwriters demonstrates significant improvements in creative iteration efficiency. Participants consistently rated the system as an augmentation tool for human-centered storytelling rather than a replacement, supporting its utility as a collaborative authoring platform.
📝 Abstract
Scriptwriting has traditionally been text-centric, a modality that only partially conveys the produced audiovisual experience. A formative study with professional writers indicated that connecting textual and audiovisual modalities can aid ideation and iteration, especially when writing dialogue. In this work, we present Script2Screen, an AI-assisted tool that integrates scriptwriting with audiovisual scene creation in a unified, synchronized workflow. Focusing on dialogue in scripts, Script2Screen generates expressive scenes with emotional speech and animated characters through a novel text-to-audiovisual-scene pipeline. The user interface provides fine-grained controls that let writers adjust audiovisual elements such as character gestures, speech emotions, and camera angles. A user study with both novice and professional writers from various domains demonstrated that Script2Screen's interactive audiovisual generation enhances the scriptwriting process, facilitating iterative refinement while complementing, rather than replacing, writers' creative efforts.