Let's Split Up: Zero-Shot Classifier Edits for Fine-Grained Video Understanding

📅 2026-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing video recognition models rely on fixed, coarse-grained taxonomies and cannot be adapted cheaply when finer-grained categories are needed. This work introduces the zero-shot category splitting problem for video classifiers and proposes a method for it: by exploiting the latent compositional structure within a trained classifier, the method refines a coarse category into meaningful subcategories without any additional labeled data. The approach can also be combined with low-shot fine-tuning to improve performance on the newly split classes. On newly established video benchmarks for category splitting, the method substantially outperforms vision-language baselines, improving accuracy on the new subcategories without sacrificing performance on the remaining categories.

📝 Abstract
Video recognition models are typically trained on fixed taxonomies, which are often too coarse, collapsing distinctions in object, manner, or outcome under a single label. As tasks and definitions evolve, such models cannot accommodate emerging distinctions, and collecting new annotations and retraining to accommodate such changes is costly. To address these challenges, we introduce category splitting, a new task where an existing classifier is edited to refine a coarse category into finer subcategories, while preserving accuracy elsewhere. We propose a zero-shot editing method that leverages the latent compositional structure of video classifiers to expose fine-grained distinctions without additional data. We further show that low-shot fine-tuning, while simple, is highly effective and benefits from our zero-shot initialization. Experiments on our new video benchmarks for category splitting demonstrate that our method substantially outperforms vision-language baselines, improving accuracy on the newly split categories without sacrificing performance on the rest. Project page: https://kaitingliu.github.io/Category-Splitting/.
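To make the task definition concrete, the following is a minimal sketch of what "editing a classifier to split a category" could look like for a plain linear classification head. It is not the paper's method: the function names `split_category` and `predict`, the toy weights, and the `offsets` directions are all hypothetical. How the paper actually derives subcategory directions from the classifier's compositional structure is not described on this page, so the offsets are simply given as inputs here.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def split_category(
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    parent: int,
    offsets: Sequence[Sequence[float]],
) -> Tuple[List[Vector], List[float], List[int]]:
    """Replace class `parent` of a linear head with len(offsets) subclasses.

    Each subclass row is the parent row plus one offset direction
    (the offsets are a hypothetical input, not the paper's procedure).
    All other rows are copied unchanged. Also returns a map from each
    new (fine) class index to its original (coarse) class index.
    """
    new_w: List[Vector] = []
    new_b: List[float] = []
    fine_to_coarse: List[int] = []
    for c, (row, b) in enumerate(zip(weights, biases)):
        if c != parent:
            new_w.append(list(row))
            new_b.append(b)
            fine_to_coarse.append(c)
            continue
        for off in offsets:
            new_w.append([w + o for w, o in zip(row, off)])
            new_b.append(b)
            fine_to_coarse.append(parent)
    return new_w, new_b, fine_to_coarse


def predict(
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    feature: Sequence[float],
) -> int:
    """Return the index of the highest-scoring class for one feature vector."""
    scores = [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weights, biases)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy two-class head over 3-d features; class 1 is split into two subclasses.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [0.0, 0.0]
W2, B2, fine_to_coarse = split_category(
    W, B, parent=1, offsets=[[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]]
)

fine = predict(W2, B2, [0.1, 0.9, -0.4])
coarse = fine_to_coarse[fine]  # maps the fine prediction back to the parent
```

The `fine_to_coarse` map is one simple way to check the "preserving accuracy elsewhere" requirement: predictions on the edited head can be mapped back to the original label space and compared against the unedited classifier.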
Problem

Research questions and friction points this paper is trying to address.

video recognition
fine-grained understanding
category splitting
zero-shot editing
classifier adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

category splitting
zero-shot editing
fine-grained video understanding
compositional structure
video recognition
🔎 Similar Papers
2024-02-20
International Conference on Machine Learning
Citations: 30