AI Summary
To address coarse-grained shot segmentation, weak semantic and temporal contextual search, and absence of map-based navigation in large-scale video interactive retrieval, this paper redesigns the diveXplore 6.0 system architecture. We propose an adaptive keyframe-driven fine-grained shot segmentation strategy; design a map-assisted search mechanism integrating geographic coordinates and semantic labels; and build a spatiotemporal joint index enabling concept-time dual-dimensional interactive exploration. The system employs a lightweight frontend framework, incorporating keyframe analysis, multimodal concept detection, and visualization-guided navigation. Evaluated on the VBS2021 benchmark, it achieves a 37% reduction in response latency and a 12% improvement in retrieval accuracy. Despite functional streamlining, interactive efficiency is significantly enhanced. The resulting system delivers high-performance, scalable exploratory video retrieval support for VBS2022.
Abstract
Having participated continuously since the sixth Video Browser Showdown (VBS2017), diveXplore is a veteran interactive search system that has offered and evaluated numerous features over its lifetime. After a major refactoring for VBS2021, however, the system has, since version 5.0, been less feature-rich yet more modern, leaner, and faster than the original. This proved to be a sensible decision: the new system performed better in VBS2021 than in the most recent previous competitions. With version 6.0, we reconsider shot segmentation and map search, and we introduce new features to improve concept search and temporal context search.