🤖 AI Summary
This work proposes RemoteVAR, a framework that addresses three limitations of visual autoregressive models (VARs) in remote sensing change detection: weak controllability, suboptimal dense prediction performance, and exposure bias. It is presented as the first effective application of visual autoregressive models to this task. RemoteVAR fuses bi-temporal features at multiple resolutions and uses cross-attention to condition the autoregressive generation process on them, and it adopts an autoregressive training strategy designed specifically for change map prediction. Experiments on standard remote sensing change detection benchmarks show that RemoteVAR significantly outperforms both diffusion-based and transformer-based baselines, establishing autoregressive approaches as competitive in this domain.
📝 Abstract
Remote sensing change detection aims to localize and characterize scene changes between two time points and is central to applications such as environmental monitoring and disaster assessment. Meanwhile, visual autoregressive models (VARs) have recently shown impressive image generation capability, but their adoption for pixel-level discriminative tasks remains limited due to weak controllability, suboptimal dense prediction performance and exposure bias. We introduce RemoteVAR, a new VAR-based change detection framework that addresses these limitations by conditioning autoregressive prediction on multi-resolution fused bi-temporal features via cross-attention, and by employing an autoregressive training strategy designed specifically for change map prediction. Extensive experiments on standard change detection benchmarks show that RemoteVAR delivers consistent and significant improvements over strong diffusion-based and transformer-based baselines, establishing a competitive autoregressive alternative for remote sensing change detection. Code will be available at https://github.com/yilmazkorkmaz1/RemoteVAR.
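To make the conditioning idea concrete, the following is a minimal NumPy sketch of cross-attention from change-map tokens to fused bi-temporal features gathered at several resolutions. It is an illustration under stated assumptions, not the RemoteVAR implementation: the fusion rule (concatenating both feature sets with their absolute difference), the single attention head, the random weights, and every name in it (`fuse_bitemporal`, `cross_attention`, `w_q`, `w_k`, `w_v`) are hypothetical choices made only for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_bitemporal(feats_t1, feats_t2):
    """Fuse features of the two dates, scale by scale.

    ASSUMPTION: fusion is a channel-wise concatenation of [f1, f2, |f1 - f2|].
    The abstract does not specify the fusion operator.
    """
    return [np.concatenate([a, b, np.abs(a - b)], axis=-1)
            for a, b in zip(feats_t1, feats_t2)]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    Queries come from the tokens being predicted; keys and values come
    from the conditioning features.
    """
    q = queries @ w_q                       # (n_query, d_model)
    k = context @ w_k                       # (n_context, d_model)
    v = context @ w_v                       # (n_context, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)      # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
d_feat, d_model = 8, 16

# Hypothetical encoder features at two resolutions (2x2 and 4x4 grids),
# flattened to (tokens, channels) for each acquisition date.
tokens_per_scale = [4, 16]
feats_t1 = [rng.normal(size=(n, d_feat)) for n in tokens_per_scale]
feats_t2 = [rng.normal(size=(n, d_feat)) for n in tokens_per_scale]

# Stack all scales into one conditioning sequence.
context = np.concatenate(fuse_bitemporal(feats_t1, feats_t2), axis=0)

# Hypothetical embeddings of the change-map tokens at the current step.
change_tokens = rng.normal(size=(4, d_model))

w_q = rng.normal(size=(d_model, d_model))
w_k = rng.normal(size=(3 * d_feat, d_model))
w_v = rng.normal(size=(3 * d_feat, d_model))

attended, weights = cross_attention(change_tokens, context, w_q, w_k, w_v)

# Residual update: the change-map tokens now carry image evidence.
conditioned = change_tokens + attended
```

In this sketch every change-map token can attend to features from all resolutions at once, which is one simple way an autoregressive predictor could be tied to the input image pair rather than generating freely.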