🤖 AI Summary
This work addresses the inefficiency of cross-modal registration and reliance on complex preprocessing in infrared and visible image fusion by proposing a visual prior-guided unified registration-and-fusion framework. The method eliminates the need for pre-registration by embedding alignment directly into the fusion process, explicitly modeling and correcting misalignments in critical regions through cross-modal misalignment representation learning and visual prior-driven local alignment, thereby avoiding forced global correspondence. Evaluated on three benchmark datasets, the approach achieves state-of-the-art fusion quality while significantly improving detail alignment accuracy and robustness to input perturbations, and it remains compatible with diverse fusion architectures.
📝 Abstract
Spatial registration across visual modalities is a critical but difficult step in multi-modality image fusion for real-world perception. Although several methods have been proposed to address this issue, existing registration-based fusion methods typically require extensive pre-registration operations, which limits their efficiency. To overcome this limitation, a general cross-modality registration method guided by visual priors, termed FusionRegister, is proposed for the infrared and visible image fusion task. First, FusionRegister achieves robustness by learning cross-modality misregistration representations rather than forcing alignment of all differences, which keeps its outputs stable even under challenging input conditions. Second, FusionRegister generalizes well because it operates directly on fused results, where misregistration is explicitly represented and handled; this allows it to be integrated with diverse fusion methods while preserving their intrinsic properties. Third, it gains efficiency by using the backbone fusion method as a natural visual prior provider, which guides the registration process to focus only on mismatched regions and thereby avoids redundant operations. Extensive experiments on three datasets demonstrate that FusionRegister not only inherits the fusion quality of state-of-the-art methods but also delivers superior detail alignment and robustness, making it well suited to infrared and visible image fusion. The code will be available at https://github.com/bociic/FusionRegister.
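To make the idea of restricting correction to mismatch regions concrete, the following is a minimal illustrative sketch and not the FusionRegister implementation. It assumes a hand-crafted mismatch indicator (disagreement in edge strength between the two inputs) in place of the learned misregistration representation, and it assumes a corrected image is already available from some registration step. The function names `gradient_magnitude`, `mismatch_mask`, and `apply_local_correction`, as well as the `threshold` parameter, are hypothetical.

```python
import numpy as np


def gradient_magnitude(img):
    """Per-pixel edge strength from finite differences."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)


def mismatch_mask(ir, vis, threshold=0.1):
    """Hypothetical mismatch indicator (an assumption, not the paper's method).

    Marks pixels where the edge strength of the infrared and visible
    images disagrees by more than `threshold`.
    """
    diff = np.abs(gradient_magnitude(ir) - gradient_magnitude(vis))
    return diff > threshold


def apply_local_correction(fused, corrected, mask):
    """Take `corrected` inside the mask and keep `fused` everywhere else."""
    return np.where(mask, corrected, fused)


if __name__ == "__main__":
    # Toy inputs: the same vertical edge, shifted by two columns in `vis`.
    ir = np.zeros((10, 10))
    ir[:, 5:] = 1.0
    vis = np.zeros((10, 10))
    vis[:, 7:] = 1.0

    fused = 0.5 * (ir + vis)   # stand-in for any backbone fusion output
    corrected = ir             # stand-in for a locally re-aligned result

    mask = mismatch_mask(ir, vis)
    out = apply_local_correction(fused, corrected, mask)
    print("pixels flagged as mismatched:", int(mask.sum()))
```

In this sketch the fused output is left untouched wherever the two inputs already agree, which mirrors the stated goal of avoiding forced global correspondence. The learned representation and the visual prior described in the abstract would replace the simple gradient heuristic used here.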