STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing over Movie Screenplays

📅 2026-01-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing narrative understanding benchmarks are usually confined to individual subtasks, so they cannot evaluate whether a model can construct a coherent narrative world and use it consistently across reasoning and generation tasks. To address this gap, this work proposes STAGE, a unified benchmark built on 150 film scripts in English and Chinese. STAGE integrates four tasks within a shared narrative world representation: knowledge graph construction, scene-level event summarization, long-context question answering, and in-script role-playing. It provides cleaned scripts, curated knowledge graphs, and event- and character-centric annotations, enabling holistic evaluation of world modeling, event abstraction, long-context comprehension, and character-consistent response generation in both languages.

📝 Abstract

Movie screenplays are rich long-form narratives that interleave complex character relationships, temporally ordered events, and dialogue-driven interactions. While prior benchmarks target individual subtasks such as question answering or dialogue generation, they rarely evaluate whether models can construct a coherent story world and use it consistently across multiple forms of reasoning and generation. We introduce STAGE (Screenplay Text, Agents, Graphs and Evaluation), a unified benchmark for narrative understanding over full-length movie screenplays. STAGE defines four tasks: knowledge graph construction, scene-level event summarization, long-context screenplay question answering, and in-script character role-playing, all grounded in a shared narrative world representation. The benchmark provides cleaned scripts, curated knowledge graphs, and event- and character-centric annotations for 150 films across English and Chinese, enabling holistic evaluation of models' abilities to build world representations, abstract and verify narrative events, reason over long narratives, and generate character-consistent responses.
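
To make the idea of a shared narrative world representation concrete, the following is a minimal Python sketch of how one film's record might be organized so that all four tasks read from the same structure. Every name here (`Triple`, `Scene`, `StoryWorld`, `neighbors`, `triple_f1`) is hypothetical, and the exact-match F1 over edges is an assumed scoring choice for illustration; the abstract does not specify STAGE's actual schema or metrics.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One knowledge-graph edge (hypothetical schema, not STAGE's own)."""
    head: str
    relation: str
    tail: str


@dataclass
class Scene:
    """One screenplay scene plus its event summary (hypothetical fields)."""
    scene_id: int
    text: str
    event_summary: str = ""


@dataclass
class StoryWorld:
    """Assumed shared representation that the four tasks would draw on."""
    title: str
    language: str  # "en" or "zh"
    scenes: list[Scene] = field(default_factory=list)
    triples: set[Triple] = field(default_factory=set)

    def neighbors(self, entity: str) -> set[str]:
        """Return entities linked to `entity` by an edge in either direction."""
        linked = set()
        for t in self.triples:
            if t.head == entity:
                linked.add(t.tail)
            elif t.tail == entity:
                linked.add(t.head)
        return linked


def triple_f1(predicted: set[Triple], gold: set[Triple]) -> float:
    """Exact-match F1 between predicted and gold edges (assumed metric)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

In a layout like this, knowledge graph construction would populate `triples`, event summarization would fill `event_summary` per scene, and question answering and role-playing could both query the same `StoryWorld` for context.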

Problem

Research questions and friction points this paper is trying to address.

knowledge graph construction

question answering

role-playing

narrative understanding

movie screenplays

Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge graph construction

narrative understanding

long-context reasoning