STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing over Movie Screenplays

📅 2026-01-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing narrative understanding benchmarks are often confined to individual subtasks, which limits their ability to evaluate whether models can construct a coherent narrative world and use it for cross-task reasoning and generation. To address this gap, this work proposes STAGE, a unified benchmark built on screenplays for 150 films across English and Chinese. It integrates four core tasks (knowledge graph construction, scene-level event summarization, long-context question answering, and role-playing) within a shared narrative world representation. The benchmark supplies cleaned scripts, curated knowledge graphs, and event- and character-centric annotations, enabling cross-lingual evaluation of world modeling, event abstraction, long-context comprehension, and persona-aware response generation.

📝 Abstract
Movie screenplays are rich long-form narratives that interleave complex character relationships, temporally ordered events, and dialogue-driven interactions. While prior benchmarks target individual subtasks such as question answering or dialogue generation, they rarely evaluate whether models can construct a coherent story world and use it consistently across multiple forms of reasoning and generation. We introduce STAGE (Screenplay Text, Agents, Graphs and Evaluation), a unified benchmark for narrative understanding over full-length movie screenplays. STAGE defines four tasks: knowledge graph construction, scene-level event summarization, long-context screenplay question answering, and in-script character role-playing, all grounded in a shared narrative world representation. The benchmark provides cleaned scripts, curated knowledge graphs, and event- and character-centric annotations for 150 films across English and Chinese, enabling holistic evaluation of models' abilities to build world representations, abstract and verify narrative events, reason over long narratives, and generate character-consistent responses.
Problem

Research questions and friction points this paper is trying to address.

knowledge graph construction
question answering
role-playing
narrative understanding
movie screenplays
Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge graph construction
narrative understanding
long-context reasoning
character-consistent generation
multitask benchmark
Qiuyu Tian
Southeast University, Nanjing, China
Yiding Li
ZhuiWen Technology Co., Ltd., Beijing, China
Fengyi Chen
Nanjing Normal University, Nanjing, China
Zequn Liu
Microsoft Research AI4Science, Asia
Youyong Kong
Associate Professor at School of Computer Science and Engineering, Southeast University
medical image processing, machine learning, brain network analysis
Fan Guo
Los Alamos National Laboratory
Particle acceleration, Magnetic Reconnection, Cosmic rays, Plasma Astrophysics, Space Physics
Yuyao Li
ZhuiWen Technology Co., Ltd., Beijing, China
Jinjing Shen
ZhuiWen Technology Co., Ltd., Beijing, China
Zhijing Xie
ZhuiWen Technology Co., Ltd., Beijing, China
Yiyun Luo
ZhuiWen Technology Co., Ltd., Beijing, China
Xin Zhang
ZhuiWen Technology Co., Ltd., Beijing, China