🤖 AI Summary
This work addresses the limited robustness of SAM-based dense trackers under long occlusion, fast motion, viewpoint change, and distractors, a weakness that is most severe for small targets. It extends DAM4SAM by improving memory control rather than modifying the backbone network. A reliability-aware state machine keeps the original single-path propagation while confidence is high; when confidence drops, it switches to an ambiguous or recovery mode, maintains a small set of candidate branches, and defers memory updates until a branch is reconfirmed. The method also explicitly preserves the first conditioning frame as an anchor, moderately enlarges the conditioning-memory budget, and combines a selective policy for native SAM3 memory selection with delayed promotion into DRM (the distractor-resolving memory of DAM4SAM). For small objects that disappear and reappear, native memory selection is temporarily bypassed so older anchors remain accessible. The design targets better re-identification after occlusion and more robust long-term tracking of small objects while keeping inference efficient in easy cases.
📝 Abstract
SAM-based dense trackers provide strong short-term mask propagation but remain fragile under long occlusion, fast motion, viewpoint change, and distractors. The problem is especially severe for small objects, where a few incorrect memory updates can dominate later predictions. This report presents an occlusion- and reappearance-aware extension of DAM4SAM that improves memory control rather than changing the backbone. The method augments the original SAM3 tracker with four ingredients: a reliability-aware tracking state machine, branch-based recovery, delayed DRM promotion, and a selective policy for native SAM3 memory selection. During stable tracking, the model follows the original single-path propagation process. Once confidence drops, the tracker enters an ambiguous or recovery mode, maintains a small set of candidate branches, and commits memory only after a branch is reconfirmed. For small-object disappearance and reappearance, native memory selection is temporarily bypassed so older anchors remain accessible. In addition, the first conditioning frame is explicitly preserved, and the conditioning-memory budget is moderately enlarged to improve long-gap recovery. The resulting design keeps DAM4SAM efficient in easy cases while improving robustness in sequences dominated by occlusion and reappearance.
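The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `AnchoredMemory` and `ReliabilityTracker`, the thresholds `high` and `low`, the reconfirmation length `confirm`, and the eviction rule are all assumptions made for illustration. Candidate-branch generation is collapsed into a single per-frame confidence value, so the sketch shows only the state transitions, the deferred memory commit, and the preserved first frame.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class TrackState(Enum):
    STABLE = auto()
    AMBIGUOUS = auto()
    RECOVERY = auto()


@dataclass
class AnchoredMemory:
    """Hypothetical conditioning memory that never evicts its first entry."""
    budget: int = 4
    frames: list = field(default_factory=list)

    def commit(self, frame_idx: int) -> None:
        self.frames.append(frame_idx)
        # Evict the oldest non-initial entry once the budget is exceeded,
        # so index 0 (the initial-frame anchor) always survives.
        while len(self.frames) > max(self.budget, 1):
            del self.frames[1]


@dataclass
class ReliabilityTracker:
    """Hypothetical reliability-aware state machine with deferred commits."""
    memory: AnchoredMemory
    high: float = 0.8   # assumed threshold for confident tracking
    low: float = 0.4    # assumed threshold below which the target counts as lost
    confirm: int = 2    # assumed number of consecutive confident frames to reconfirm
    state: TrackState = TrackState.STABLE
    pending: list = field(default_factory=list)
    streak: int = 0

    def _degraded_state(self, confidence: float) -> TrackState:
        if confidence >= self.low:
            return TrackState.AMBIGUOUS
        return TrackState.RECOVERY

    def step(self, frame_idx: int, confidence: float) -> TrackState:
        if self.state is TrackState.STABLE:
            if confidence >= self.high:
                # Stable tracking: update memory immediately.
                self.memory.commit(frame_idx)
            else:
                # Confidence dropped: stop updating memory.
                self.state = self._degraded_state(confidence)
                self.streak, self.pending = 0, []
            return self.state

        if confidence >= self.high:
            # Hold the frame back until the target is reconfirmed.
            self.streak += 1
            self.pending.append(frame_idx)
            if self.streak >= self.confirm:
                for idx in self.pending:
                    self.memory.commit(idx)
                self.pending = []
                self.state = TrackState.STABLE
        else:
            # An unconfirmed candidate is discarded, never written to memory.
            self.streak, self.pending = 0, []
            self.state = self._degraded_state(confidence)
        return self.state


memory = AnchoredMemory(budget=3)
tracker = ReliabilityTracker(memory=memory)
for idx, conf in enumerate([0.95, 0.9, 0.3, 0.5, 0.85, 0.9]):
    tracker.step(idx, conf)
print(memory.frames)  # [0, 4, 5]
```

In this run, frames 2 and 3 are never written to memory, frames 4 and 5 are committed only once the second confident frame arrives, and frame 0 stays in memory even though the budget forces an eviction. A fuller version would track several candidate branches in the degraded states and promote only the branch that is reconfirmed.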