SAVeD: Learning to Denoise Low-SNR Video for Improved Downstream Performance

📅 2025-03-31
🤖 AI Summary
Problem: Foundation vision models degrade severely on low-SNR video data, such as underwater sonar, ultrasound, and microscopy, where noise impairs spatiotemporal feature learning.
Method: We propose SAVeD, a self-supervised spatiotemporal denoising framework that trains on the noisy input video alone, so no ground-truth annotations are needed. SAVeD introduces a motion-aware bottleneck encoder-decoder architecture that explicitly models the differing dynamics of foreground and background motion, enabling task-driven, lightweight denoising. It integrates self-supervised learning, motion-sensitive feature disentanglement, and efficient spatiotemporal convolutions.
Contribution/Results: On downstream tasks (classification, detection, tracking, and counting), SAVeD outperforms state-of-the-art video denoising methods while reducing computational overhead and memory footprint.

📝 Abstract
Foundation models excel at vision tasks in natural images but fail in low signal-to-noise ratio (SNR) videos, such as underwater sonar, ultrasound, and microscopy. We introduce Spatiotemporal Augmentations and denoising in Video for Downstream Tasks (SAVeD), a self-supervised method that denoises low-SNR sensor videos and is trained using only the raw noisy data. By leveraging differences in foreground and background motion, SAVeD enhances object visibility using an encoder-decoder with a temporal bottleneck. Our approach improves classification, detection, tracking, and counting, outperforming state-of-the-art video denoising methods with lower resource requirements.
Project page: https://suzanne-stathatos.github.io/SAVeD
Code page: https://github.com/suzanne-stathatos/SAVeD
Problem

Research questions and friction points this paper is trying to address.

Denoising low-SNR videos to improve downstream task performance
Training a denoiser in a self-supervised way, using only raw noisy video data
Improving object visibility with a motion-based encoder-decoder
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised denoising for low-SNR videos
Encoder-decoder with temporal bottleneck
Leverages foreground-background motion differences
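
The motion cue named in the list above can be illustrated with a classical baseline. The sketch below is not SAVeD and is not taken from the paper's code: it swaps in per-pixel temporal median background subtraction, a standard technique, to show why a foreground object that moves differently from a static background becomes easier to see. All function names and the toy clip are hypothetical, and the clip is noise-free for readability.

```python
from statistics import median


def temporal_median_background(frames):
    """Estimate a static background as the per-pixel median over time."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]


def foreground_residual(frames):
    """Return |frame - background| for every frame in the clip."""
    background = temporal_median_background(frames)
    return [
        [
            [abs(frame[r][c] - background[r][c]) for c in range(len(frame[0]))]
            for r in range(len(frame))
        ]
        for frame in frames
    ]


def make_toy_clip(num_frames=5, width=5, background=10, target=50):
    """Hypothetical 1-row clip: one bright pixel moves one column per frame."""
    clip = []
    for t in range(num_frames):
        row = [background] * width
        row[t % width] = target
        clip.append([row])  # each frame is a list of rows
    return clip


clip = make_toy_clip()
residual = foreground_residual(clip)

# Column of the strongest residual in each frame: it follows the moving object.
brightest = [max(range(len(f[0])), key=lambda c: f[0][c]) for f in residual]
print(brightest)
```

Because the object occupies each pixel in only one of the five frames, the median ignores it and recovers the background, so the residual isolates the moving pixel. A learned model such as the one the abstract describes would have to cope with heavy noise and moving backgrounds, which this baseline does not attempt.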