🤖 AI Summary
Existing video fusion methods predominantly process frames independently, neglecting temporal correlations and consequently suffering from flickering and temporal inconsistency. To address this, we propose Unified Video Fusion (UniVF), a framework for temporally coherent video fusion built on joint spatiotemporal optimization. Our contributions are threefold: (1) UniVF itself, which leverages multi-frame learning and optical flow-based feature warping to align neighboring frames before fusion; (2) VF-Bench, the first comprehensive Video Fusion Benchmark, covering four tasks: multi-exposure, multi-focus, infrared-visible, and medical fusion, with well-aligned video pairs built through synthetic data generation and rigorous curation of existing datasets; and (3) a unified evaluation protocol that jointly assesses the spatial quality and temporal consistency of fused videos. Extensive experiments on VF-Bench demonstrate that UniVF consistently outperforms state-of-the-art methods in both temporal consistency and spatial fidelity.
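As a rough illustration of what optical flow-based feature warping looks like in practice, here is a minimal PyTorch sketch. The helper below is hypothetical and not from the paper's code; it assumes a precomputed backward flow field (current frame to previous frame) and bilinear resampling via `grid_sample`.

```python
import torch
import torch.nn.functional as F

def warp_features(feat_prev: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp previous-frame features into the current frame's coordinates.

    feat_prev: (B, C, H, W) features of frame t-1.
    flow:      (B, 2, H, W) optical flow from frame t to frame t-1,
               in pixels (flow[:, 0] = x-offset, flow[:, 1] = y-offset).
    """
    b, _, h, w = flow.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=flow.dtype, device=flow.device),
        torch.arange(w, dtype=flow.dtype, device=flow.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0)   # (1, 2, H, W)
    coords = grid + flow                               # sample locations in frame t-1
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(feat_prev, sample_grid, mode="bilinear",
                         padding_mode="border", align_corners=True)
```

The warped features can then be fused with the current frame's features (e.g., by concatenation and convolution), which is one standard way to realize multi-frame collaborative learning.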
📝 Abstract
The real world is dynamic, yet most image fusion methods process static frames independently, ignoring temporal correlations in videos and leading to flickering and temporal inconsistency. To address this, we propose Unified Video Fusion (UniVF), a novel framework that leverages multi-frame learning and optical flow-based feature warping to produce informative, temporally coherent fused videos. To support its development, we also introduce the Video Fusion Benchmark (VF-Bench), the first comprehensive benchmark covering four video fusion tasks: multi-exposure, multi-focus, infrared-visible, and medical fusion. VF-Bench provides high-quality, well-aligned video pairs obtained through synthetic data generation and rigorous curation of existing datasets, together with a unified evaluation protocol that jointly assesses the spatial quality and temporal consistency of video fusion. Extensive experiments show that UniVF achieves state-of-the-art results across all tasks on VF-Bench. Project page: https://vfbench.github.io.
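One common way to quantify temporal consistency, in the spirit of VF-Bench's joint evaluation, is a flow-based warping error between consecutive fused frames. The sketch below is an assumption about such a metric, not the paper's exact protocol; it reuses the hypothetical `warp_features` helper from the earlier sketch and assumes flows from an off-the-shelf estimator such as RAFT.

```python
import torch
import torch.nn.functional as F

def temporal_warping_error(fused: torch.Tensor, flows: torch.Tensor) -> torch.Tensor:
    """Mean warping error across a fused video (illustrative metric).

    fused: (T, C, H, W) fused frames.
    flows: (T-1, 2, H, W) optical flow from frame t+1 back to frame t,
           estimated by any off-the-shelf method (hypothetical input).
    Lower values suggest less flicker between consecutive frames.
    """
    errors = []
    for t in range(fused.shape[0] - 1):
        # Warp frame t into frame t+1's coordinates, then compare.
        prev_warped = warp_features(fused[t:t + 1], flows[t:t + 1])
        errors.append(F.l1_loss(prev_warped, fused[t + 1:t + 2]))
    return torch.stack(errors).mean()
```

In practice such metrics are usually combined with an occlusion mask so that pixels without valid correspondences do not dominate the error.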