SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

📅 2025-04-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing multimodal large language models (MLLMs) are evaluated on video understanding tasks using isolated frames or single videos, so these evaluations fail to capture the continuous, narrative-driven sequences prevalent in real-world scenarios. Method: We introduce SeriesBench, the first multi-task benchmark for episode-level narrative understanding, comprising 105 curated narrative-driven series and 28 fine-grained narrative tasks. It features a novel long-span narrative annotation scheme and a full-information mechanism that automatically converts manual annotations into multiple task formats. We further propose PC-DCoT, a reasoning framework that explicitly models plot-level causal chains and dynamic character interactions. Contribution/Results: Experiments reveal significant bottlenecks in current MLLMs’ episode-level narrative comprehension. PC-DCoT boosts average accuracy of mainstream models on SeriesBench by 19.7%. The benchmark is publicly released, and the paper was accepted at CVPR 2025.

📝 Abstract

With the rapid development of Multi-modal Large Language Models (MLLMs), an increasing number of benchmarks have been established to evaluate the video understanding capabilities of these models. However, these benchmarks focus on standalone videos and mainly assess "visual elements" like human actions and object states. In reality, contemporary videos often encompass complex and continuous narratives, typically presented as a series. To address this challenge, we propose SeriesBench, a benchmark consisting of 105 carefully curated narrative-driven series, covering 28 specialized tasks that require deep narrative understanding. Specifically, we first select a diverse set of drama series spanning various genres. Then, we introduce a novel long-span narrative annotation method, combined with a full-information transformation approach to convert manual annotations into diverse task formats. To further enhance model capacity for detailed analysis of plot structures and character relationships within series, we propose a novel narrative reasoning framework, PC-DCoT. Extensive results on SeriesBench indicate that existing MLLMs still face significant challenges in understanding narrative-driven series, while PC-DCoT enables these MLLMs to achieve performance improvements. Overall, our SeriesBench and PC-DCoT highlight the critical necessity of advancing model capabilities to understand narrative-driven series, guiding the future development of MLLMs. SeriesBench is publicly available at https://github.com/zackhxn/SeriesBench-CVPR2025.

Problem

Research questions and friction points this paper is trying to address.

Evaluating MLLMs' narrative understanding in drama series

Addressing lack of benchmarks for continuous video narratives

Improving model analysis of plot structures and character relationships

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces SeriesBench for narrative-driven series evaluation

Develops long-span narrative annotation method

Proposes PC-DCoT framework for narrative reasoning
