🤖 AI Summary
Existing image warping methods require task-specific model training, generalize poorly across camera models, and adapt badly to customized manipulations. To address this, MOWA is a unified, multiple-in-one image warping model that handles six distinct warping tasks within a single architecture. Its key innovations are: (1) disentangling motion estimation at both the region level and the pixel level, which eases multi-task learning and improves geometric modeling accuracy; and (2) a lightweight point-based classifier that predicts the task type, serving as a prompt that modulates the feature maps for task-aware estimation. Trained jointly on six tasks, MOWA outperforms dedicated state-of-the-art models on most tasks and shows promising generalization to unseen scenes, as evidenced by cross-domain and zero-shot evaluations.
📝 Abstract
While recent image warping approaches have achieved remarkable success on existing benchmarks, they still require training separate models for each specific task and cannot generalize well to different camera models or customized manipulations. To address diverse types of warping in practice, we propose a Multiple-in-One image WArping model (named MOWA) in this work. Specifically, we mitigate the difficulty of multi-task learning by disentangling the motion estimation at both the region level and pixel level. To further enable dynamic task-aware image warping, we introduce a lightweight point-based classifier that predicts the task type, serving as a prompt to modulate the feature maps for more accurate estimation. To our knowledge, this is the first work that solves multiple practical warping tasks in a single model. Extensive experiments demonstrate that our MOWA, which is trained on six tasks for multiple-in-one image warping, outperforms state-of-the-art task-specific models across most tasks. Moreover, MOWA also exhibits promising potential to generalize to unseen scenes, as evidenced by cross-domain and zero-shot evaluations. The code and more visual results can be found on the project page: https://kangliao929.github.io/projects/mowa/.
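The abstract describes a point-based classifier whose predicted task type serves as a prompt that modulates the feature maps. The following is a minimal NumPy sketch of one plausible reading of that idea, not the paper's actual implementation: a soft task prediction from predicted control points selects a blend of learnable per-task prompts, which then scale and shift a feature map (FiLM-style modulation). All names, dimensions, and the linear classifier here are illustrative assumptions.

```python
import numpy as np

NUM_TASKS = 6   # six warping tasks, per the paper
NUM_POINTS = 16  # assumed number of predicted 2-D control points
FEAT_C = 8       # assumed feature-map channel count

rng = np.random.default_rng(0)
# Hypothetical lightweight classifier: a single linear layer over flattened points.
W_cls = rng.standard_normal((NUM_TASKS, 2 * NUM_POINTS))
# Hypothetical learnable prompts: per-task (scale, shift) pairs, one per channel.
task_prompts = rng.standard_normal((NUM_TASKS, 2 * FEAT_C))


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def task_aware_modulate(points, feat):
    """Modulate a feature map by a task prompt inferred from control points.

    points: (NUM_POINTS, 2) predicted motion control points.
    feat:   (FEAT_C, H, W) feature map to be modulated.
    Returns a feature map of the same shape.
    """
    # Soft task prediction from the point configuration.
    task_probs = softmax(W_cls @ points.reshape(-1))
    # Blend the per-task prompts by predicted probability.
    prompt = task_probs @ task_prompts
    scale, shift = prompt[:FEAT_C], prompt[FEAT_C:]
    # FiLM-style conditional modulation, broadcast over spatial dims.
    return feat * (1.0 + scale)[:, None, None] + shift[:, None, None]
```

In a trained network, `W_cls` and `task_prompts` would be learned end to end; here they are random placeholders only to show the data flow from points to prompt to modulated features.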