🤖 AI Summary
This paper addresses key challenges in 360° omnidirectional image (ODI) generation and editing—namely, spherical geometric distortion, the difficulty of modeling wide-field-of-view content, and task fragmentation—by proposing Omni², the first unified single-model framework for both ODI generation and editing. Methodologically, the authors introduce Any2Omni, a large-scale multi-task benchmark comprising over 60,000 samples across nine diverse tasks, and design an end-to-end diffusion model that integrates spherical coordinate-aware representations, multimodal conditional encoding, and a shared architecture. The contributions are threefold: (1) the first unified modeling of ODI generation and editing within a single framework; (2) overcoming the geometric adaptation bottleneck of conventional 2D models on spherical manifolds; and (3) significant improvements over state-of-the-art methods across multiple tasks, demonstrating strong generalization capability and geometric consistency.
📝 Abstract
$360^{\circ}$ omnidirectional images (ODIs) have gained considerable attention recently and are widely used in various virtual reality (VR) and augmented reality (AR) applications. However, capturing such images is expensive and requires specialized equipment, making ODI synthesis increasingly important. While common 2D image generation and editing methods are rapidly advancing, these models struggle to deliver satisfactory results when generating or editing ODIs due to the unique format and broad $360^{\circ}$ Field-of-View (FoV) of ODIs. To bridge this gap, we construct \textbf{\textit{Any2Omni}}, the first comprehensive ODI generation-editing dataset, comprising 60,000+ training samples covering diverse input conditions and up to 9 ODI generation and editing tasks. Built upon Any2Omni, we propose an \textbf{\underline{Omni}} model for \textbf{\underline{Omni}}-directional image generation and editing (\textbf{\textit{Omni$^2$}}), capable of handling various ODI generation and editing tasks under diverse input conditions with one model. Extensive experiments demonstrate the superiority and effectiveness of the proposed Omni$^2$ model on both ODI generation and editing tasks.