AI Summary
This work addresses the limitations of existing video motion magnification methods, which often suffer from structural inconsistencies under complex geometric transformations and are constrained by limited receptive fields, high computational costs, and a lack of training data that captures realistic geometric and imaging complexities. To overcome these challenges, we propose GeoMag, a novel framework that introduces state space models to this task for the first time, enabling globally consistent motion magnification with linear computational complexity. Additionally, we construct Geo-200K, the first large-scale synthetic dataset that integrates complex geometric transformations with realistic sensor-induced degradations. Experiments demonstrate that GeoMag significantly outperforms current methods on both synthetic and real-world data, achieving superior visual fidelity and structural consistency while effectively suppressing artifacts and maintaining higher computational efficiency.
Abstract
Video Motion Magnification (VMM) reveals imperceptible dynamics but often suffers from structural inconsistencies under complex geometric transformations. Existing learning-based methods generally face a trade-off between the limited global context of CNNs and the high computational cost of Transformers. In addition, current training protocols, largely dominated by simple linear motion, fail to capture the geometric and imaging complexities encountered in real-world videos. To address these issues, we propose GeoMag, a geometric-aware VMM framework built upon State Space Models to achieve globally consistent motion amplification with linear complexity. We further construct Geo-200K, a large-scale synthetic dataset that introduces rich geometric transformations together with sensor-realistic degradations, improving the diversity and realism of training signals. Extensive experiments on synthetic and real-world benchmarks show that GeoMag consistently outperforms prior methods in visual fidelity and computational efficiency, while producing fewer artifacts and better structural consistency.
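The linear-complexity claim can be illustrated with the recurrent form of a generic discrete state space model: each step updates a fixed-size hidden state, so the cost grows linearly with sequence length, while every output still depends on all earlier inputs. The following is a minimal scalar sketch under stated assumptions, not GeoMag's actual architecture; the function name `ssm_scan` and the parameters `a`, `b`, and `c` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def ssm_scan(u: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state space recurrence over the input sequence u.

    h[t] = a * h[t-1] + b * u[t]   (state update)
    y[t] = c * h[t]                (readout)

    Illustrative assumption: scalar state and fixed parameters. Practical SSM
    layers use vector-valued states and learned, often input-dependent,
    parameters.
    """
    h = 0.0                  # hidden state, starts at zero
    y: List[float] = []
    for u_t in u:            # one constant-cost update per element: O(len(u))
        h = a * h + b * u_t
        y.append(c * h)
    return y


# An impulse at t = 0 keeps influencing later outputs through the state.
outputs = ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0)
```

In this sketch the single pass over the sequence carries global context at linear cost, in contrast to the pairwise token interactions of standard self-attention, whose cost grows quadratically with sequence length.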