OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer

📅 2026-04-27
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing shot boundary detection methods struggle to accurately identify subtle transitions due to limitations imposed by noisy, low-diversity human annotations and the absence of a modern, comprehensive evaluation benchmark. This work reframes the task as structured relational prediction and introduces the Shot-Query Transformer, the first end-to-end framework that jointly models intra- and inter-shot relationships through a shot query mechanism to densely capture temporal dependencies among video frames. To address data scarcity and diversity, the authors also develop a fully automatic, parameterized pipeline for synthesizing video transitions, enabling large-scale, diverse training data generation. They further release OmniShotCutBench, a comprehensive benchmark spanning multiple domains. The proposed method achieves significant improvements in both accuracy and interpretability for hard and soft cuts, consistently outperforming existing approaches across the new benchmark.

📝 Abstract
Shot Boundary Detection (SBD) aims to automatically identify shot changes and divide a video into coherent shots. While SBD has been widely studied in the literature, existing state-of-the-art methods often produce non-interpretable boundaries on transitions, miss subtle yet harmful discontinuities, and rely on noisy, low-diversity annotations and outdated benchmarks. To alleviate these limitations, we propose OmniShotCut, which formulates SBD as structured relational prediction, jointly estimating shot ranges together with intra-shot and inter-shot relations using a shot-query-based dense video Transformer. To avoid imprecise manual labeling, we adopt a fully synthetic transition synthesis pipeline that automatically reproduces major transition families with precise boundaries and parameterized variants. We also introduce OmniShotCutBench, a modern wide-domain benchmark enabling holistic and diagnostic evaluation.
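To make the idea of a synthetic transition with precise boundaries concrete, the sketch below joins two clips with a linear cross-dissolve and returns per-frame labels that mark exactly where the transition starts and ends. It is an illustration only, not the authors' pipeline: the function name `synthesize_dissolve`, the `duration` parameter, and the three-way label scheme (0 = first shot, 1 = transition, 2 = second shot) are all assumptions introduced here, and the paper's actual transition families and parameterization are not described in this abstract.

```python
import numpy as np


def synthesize_dissolve(shot_a, shot_b, duration):
    """Join two clips with a linear cross-dissolve (hypothetical helper).

    shot_a, shot_b: float arrays of shape (T, H, W, C) with values in [0, 1].
    duration: number of blended frames (an assumed, illustrative parameter).

    Returns (video, labels), where labels[i] is
    0 for frames of shot A, 1 for transition frames, 2 for frames of shot B.
    """
    if not 0 < duration <= min(len(shot_a), len(shot_b)):
        raise ValueError("duration must be in 1..min(len(shot_a), len(shot_b))")

    # Blend weights strictly between 0 and 1, so every transition frame
    # mixes both shots and the boundary frames stay unambiguous.
    alpha = np.arange(1, duration + 1) / (duration + 1)
    alpha = alpha[:, None, None, None]  # broadcast over H, W, C

    # Overlap the last `duration` frames of A with the first of B.
    blend = (1.0 - alpha) * shot_a[-duration:] + alpha * shot_b[:duration]

    video = np.concatenate([shot_a[:-duration], blend, shot_b[duration:]])
    labels = np.concatenate([
        np.zeros(len(shot_a) - duration, dtype=int),
        np.ones(duration, dtype=int),
        np.full(len(shot_b) - duration, 2, dtype=int),
    ])
    return video, labels
```

Because the labels are produced by the same code that renders the blend, the boundary positions are exact by construction, which is the property the abstract contrasts with manual annotation. Varying `duration` (or swapping the linear ramp for another curve) would be one simple way to obtain parameterized variants.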

Problem

Research questions and friction points this paper is trying to address.

Shot Boundary Detection
video transitions
annotation noise
benchmark limitations
discontinuity detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Shot Boundary Detection
Shot-Query Transformer
Relational Prediction
Synthetic Transition Synthesis
Video Benchmark