Turbulence-Robust Dynamic Object Segmentation with Multi-Signal Priors and SAM2 Refinement

📅 2026-05-27

🤖 AI Summary

This work addresses dynamic object segmentation under atmospheric turbulence, which introduces spurious motion, blur, and intermittent target visibility. The authors propose a training-free, multi-signal inference framework that combines RAFT optical flow estimation, DINOv2 self-supervised semantic priors, and ViBe background anomaly modeling. The fused cues yield bounding-box proposals, which are used to prompt the pretrained SAM2 model for mask refinement, so the whole system runs in pure inference mode. The design avoids end-to-end models that depend on a single, unstable motion cue, and the submitted masks score 0.425 mIoU and 0.457 mDice on the official leaderboard of the CVPR 2026 UG2+ Challenge Track 3.
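
The two reported metrics can be illustrated with a minimal sketch. It assumes binary masks stored as 2D lists of 0/1, and the function names `iou_and_dice` and `mean_scores` are hypothetical. The source does not say how the official leaderboard averages scores or treats empty masks, so both choices below are assumptions rather than the challenge's evaluation code.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equally sized binary masks (2D lists of 0/1)."""
    inter = pred_area = target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            inter += p and t        # pixel is foreground in both masks
            pred_area += p
            target_area += t
    union = pred_area + target_area - inter
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + target_area)


def mean_scores(pairs):
    """Average IoU and Dice over (pred, target) pairs (assumed: plain per-mask mean)."""
    scores = [iou_and_dice(pred, target) for pred, target in pairs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n
```

For a single mask pair, Dice equals 2·IoU / (1 + IoU), so Dice is never lower than IoU. That relation need not hold exactly between the averaged values.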

📝 Abstract

This technical report presents our solution for the CVPR 2026 UG2+ Challenge Track 3: Dynamic Object Segmentation in Turbulence (DOST). We design a training-free multi-signal segmentation pipeline that combines pretrained motion estimation, self-supervised semantic priors, background anomaly modeling, manually calibrated proposal fusion, and SAM2-based mask refinement. The method uses RAFT for dense motion responses, DINOv2 for semantic objectness priors, ViBe for training-free background modeling, and pretrained SAM2 for box-prompt mask refinement. Instead of optimizing an end-to-end segmentation network, our system operates entirely in inference mode. This design is suitable for the DOST setting, where severe atmospheric turbulence produces pseudo-motion, blur, and intermittent target visibility, making a single motion cue unreliable. The final submitted masks are evaluated by the official leaderboard, which reports 0.425041 mIoU and 0.457206 mDice. Since no task-specific model training or fine-tuning is performed, stronger learned temporal association, adaptive proposal selection, or task-specific adaptation may further improve the system.
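
The proposal-fusion step described above can be sketched as a weighted combination of per-pixel cue maps followed by box extraction. This is only an illustration under stated assumptions: the function name `fuse_to_box`, the weights, and the threshold are hypothetical (the report says the fusion is manually calibrated but gives no values here), and each cue map is assumed to be already normalized to [0, 1]. The RAFT, DINOv2, ViBe, and SAM2 models themselves are not invoked.

```python
def fuse_to_box(motion, semantic, anomaly, weights=(0.4, 0.3, 0.3), threshold=0.5):
    """Fuse three cue maps (2D lists in [0, 1]) into one box (x_min, y_min, x_max, y_max).

    Hypothetical weights and threshold; returns None if no pixel passes the threshold.
    """
    w_m, w_s, w_a = weights
    xs, ys = [], []
    rows = zip(motion, semantic, anomaly)
    for y, (m_row, s_row, a_row) in enumerate(rows):
        for x, (m, s, a) in enumerate(zip(m_row, s_row, a_row)):
            # Weighted sum: no single cue can pass the threshold on its own.
            if w_m * m + w_s * s + w_a * a >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    # Tight box around all accepted pixels, usable as a box prompt for a refiner.
    return (min(xs), min(ys), max(xs), max(ys))
```

With these illustrative weights, a pixel that responds only to motion scores 0.4 and is rejected, which mirrors the abstract's point that turbulence-induced pseudo-motion makes a single motion cue unreliable. The resulting box would then be handed to the pretrained SAM2 model as a box prompt for mask refinement.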

Problem

Research questions and friction points this paper is trying to address.

dynamic object segmentation

atmospheric turbulence

pseudo-motion

intermittent visibility

blur

Innovation

Methods, ideas, or system contributions that make the work stand out.

turbulence-robust segmentation

training-free pipeline

multi-signal fusion