🤖 AI Summary
In realistic video restoration, the coupled degradation of motion blur and dynamic exposure variation—prevalent in auto-exposure and low-light capture—has long been overlooked. This work is the first to explicitly model this joint degradation mechanism, proposing an end-to-end framework for joint video super-resolution and deblurring. The method introduces exposure-time-aware modulation layers and optical-flow-guided dynamic filtering modules, integrated within a bidirectional hierarchical refinement architecture to enable long-range, parallel feature propagation and decoupled prior learning. It further establishes REDS-ME (multi-exposure) and REDS-RE (random-exposure), the first dedicated benchmarks for evaluating this coupled degradation. Trained solely on synthetic data, the approach achieves state-of-the-art performance on both benchmarks and the GoPro dataset, improving restoration quality, inference speed, and generalization to real-world videos.
📝 Abstract
Real-world video restoration is plagued by complex degradations from motion coupled with dynamically varying exposure, a key challenge largely overlooked by prior works and a common artifact of auto-exposure or low-light capture. We present FMA-Net++, a framework for joint video super-resolution and deblurring that explicitly models this coupled effect of motion and dynamically varying exposure. FMA-Net++ adopts a sequence-level architecture built from Hierarchical Refinement with Bidirectional Propagation blocks, enabling parallel, long-range temporal modeling. Within each block, an Exposure Time-aware Modulation layer conditions features on per-frame exposure, which in turn drives an exposure-aware Flow-Guided Dynamic Filtering module to infer motion- and exposure-aware degradation kernels. FMA-Net++ decouples degradation learning from restoration: the former predicts exposure- and motion-aware priors to guide the latter, improving both accuracy and efficiency. To evaluate under realistic capture conditions, we introduce the REDS-ME (multi-exposure) and REDS-RE (random-exposure) benchmarks. Trained solely on synthetic data, FMA-Net++ achieves state-of-the-art accuracy and temporal consistency on our new benchmarks and on GoPro, outperforming recent methods in both restoration quality and inference speed, and generalizes well to challenging real-world videos.
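To make the conditioning idea concrete, the sketch below shows one plausible reading of an exposure-time-aware modulation layer: a scalar per-frame exposure time is mapped to per-channel scale and shift parameters (FiLM-style) that modulate a feature map. The class name, layer sizes, and parameterization are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class ExposureModulation:
    """Hypothetical sketch: map a scalar exposure time to per-channel
    scale/shift (gamma, beta) and apply them to a feature map,
    FiLM-style. Weights are random stand-ins for learned parameters."""

    def __init__(self, channels, hidden=16):
        # Small 2-layer MLP: exposure scalar -> hidden -> (gamma, beta)
        self.w1 = rng.standard_normal((1, hidden)) * 0.1
        self.w2 = rng.standard_normal((hidden, 2 * channels)) * 0.1
        self.channels = channels

    def __call__(self, feat, exposure_time):
        # feat: (C, H, W) feature map; exposure_time: scalar, e.g. seconds
        h = np.tanh(np.array([[exposure_time]]) @ self.w1)
        gamma_beta = (h @ self.w2).ravel()
        gamma = gamma_beta[: self.channels].reshape(-1, 1, 1)
        beta = gamma_beta[self.channels :].reshape(-1, 1, 1)
        # Residual modulation keeps the layer close to identity initially
        return (1.0 + gamma) * feat + beta

mod = ExposureModulation(channels=8)
feat = rng.standard_normal((8, 4, 4))
out = mod(feat, exposure_time=1 / 60)
assert out.shape == feat.shape
```

Under this reading, the predicted gamma/beta would differ per frame as exposure varies, letting downstream modules (such as the flow-guided dynamic filtering) receive exposure-conditioned features.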