MOVE: Motion-Guided Few-Shot Video Object Segmentation

πŸ“… 2025-07-29
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Existing few-shot video object segmentation (FSVOS) methods neglect motion dynamics and over-rely on static category priors. To address this, we propose a motion-guided FSVOS paradigm. First, we introduce MOVEβ€”the first few-shot video segmentation benchmark explicitly designed for motion pattern modeling. Second, we design the Decoupled Motion-Appearance (DMA) network, which explicitly separates and models motion cues and appearance representations via optical flow alignment, temporal modeling, and multi-scale feature fusion. Third, we conduct systematic evaluations of six state-of-the-art methods under two standard FSVOS settings, demonstrating that DMA significantly outperforms existing approaches and achieves new state-of-the-art performance on MOVE. This work pioneers the integration of motion awareness into few-shot video object segmentation, establishing both a novel benchmark and a principled methodological framework for dynamic scene understanding.

Technology Category

Application Category

πŸ“ Abstract
This work addresses motion-guided few-shot video object segmentation (FSVOS), which aims to segment dynamic objects in videos based on a few annotated examples with the same motion patterns. Existing FSVOS datasets and methods typically focus on object categories, which are static attributes that ignore the rich temporal dynamics in videos, limiting their application in scenarios requiring motion understanding. To fill this gap, we introduce MOVE, a large-scale dataset specifically designed for motion-guided FSVOS. Based on MOVE, we comprehensively evaluate 6 state-of-the-art methods from 3 different related tasks across 2 experimental settings. Our results reveal that current methods struggle to address motion-guided FSVOS, prompting us to analyze the associated challenges and propose a baseline method, Decoupled Motion Appearance Network (DMA). Experiments demonstrate that our approach achieves superior performance in few shot motion understanding, establishing a solid foundation for future research in this direction.
Problem

Research questions and friction points this paper is trying to address.

Segment dynamic objects using few motion-guided examples
Overcome limitations of static category-focused FSVOS methods
Address lack of datasets for motion-guided video segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces MOVE dataset for motion-guided FSVOS
Proposes Decoupled Motion Appearance Network (DMA)
Evaluates 6 state-of-the-art methods comprehensively
πŸ”Ž Similar Papers
No similar papers found.