IntrinsicEdit: Precise generative image manipulation in intrinsic space

📅 2025-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current generative diffusion models for image editing support prompt- and semantics-guided manipulation but lack pixel-level precision and are typically limited to a single editing task. To address this, we propose a unified editing framework grounded in an intrinsic-image latent space. It is presented as the first method to combine exact diffusion inversion with disentangled intrinsic channels (such as diffuse reflectance, specular reflectance, and surface normals) within an RGB-X diffusion architecture. The approach enables targeted latent-space editing without fine-tuning or auxiliary data. Because it operates on physically grounded intrinsic representations, it preserves object identity and keeps global illumination consistent after an edit. The framework supports color and texture editing, object insertion and deletion, relighting, and combinations of these operations. Evaluation on complex images reports state-of-the-art performance across these tasks.

📝 Abstract

Generative diffusion models have advanced image editing with high-quality results and intuitive interfaces such as prompts and semantic drawing. However, these interfaces lack precise control, and the associated methods typically specialize in a single editing task. We introduce a versatile, generative workflow that operates in an intrinsic-image latent space, enabling semantic, local manipulation with pixel precision for a range of editing operations. Building atop the RGB-X diffusion framework, we address key challenges of identity preservation and intrinsic-channel entanglement. By incorporating exact diffusion inversion and disentangled channel manipulation, we enable precise, efficient editing with automatic resolution of global illumination effects, all without additional data collection or model fine-tuning. We demonstrate state-of-the-art performance across a variety of tasks on complex images, including color and texture adjustments, object insertion and removal, global relighting, and their combinations.
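
To make "local manipulation with pixel precision" in an intrinsic channel concrete, here is a minimal sketch of a masked edit on one channel of an intrinsic decomposition. It is only an illustration of the idea, not the paper's pipeline: the real method edits in a diffusion latent space, while this toy works on plain nested lists. The names `edit_channel`, `albedo`, and `normals` are hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[float, float, float]
Channel = List[List[Pixel]]   # rows of RGB-like triples
Mask = List[List[bool]]       # True where the edit applies


def edit_channel(
    channels: Dict[str, Channel],
    name: str,
    mask: Mask,
    new_value: Pixel,
) -> Dict[str, Channel]:
    """Return a copy of `channels` in which only `channels[name]` is changed,
    and only at pixels where `mask` is True. Other channels are passed
    through untouched (hypothetical helper, for illustration only)."""
    src = channels[name]
    edited = dict(channels)  # shallow copy: untouched channels are shared
    edited[name] = [
        [new_value if mask[y][x] else src[y][x] for x in range(len(src[y]))]
        for y in range(len(src))
    ]
    return edited


# Toy 2x2 decomposition: a gray albedo and upward-facing normals (assumed data).
gray: Pixel = (0.5, 0.5, 0.5)
up: Pixel = (0.0, 0.0, 1.0)
channels = {
    "albedo": [[gray, gray], [gray, gray]],
    "normals": [[up, up], [up, up]],
}
mask = [[True, False], [False, False]]  # edit the top-left pixel only

edited = edit_channel(channels, "albedo", mask, (1.0, 0.0, 0.0))
```

In this sketch the edit touches a single channel inside the mask and nothing else. In the workflow described above, the generative model would then re-synthesize the RGB image from the edited channels, which is where global illumination effects would be resolved.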

Problem

Research questions and friction points this paper is trying to address.

Achieving precise generative image manipulation in intrinsic space

Overcoming identity preservation and intrinsic-channel entanglement challenges

Enabling versatile editing without additional data or model fine-tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Operates in intrinsic-image latent space

Uses exact diffusion inversion

Manipulates intrinsic channels in a disentangled manner
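
The page does not say which exact-inversion scheme the paper uses. As a generic illustration of what "exact" means here, the sketch below uses a coupled two-state update in which each step can be undone algebraically, so running the steps backward recovers the starting state up to floating-point rounding. The functions `forward_step` and `inverse_step`, the scalar states, and the `math.tanh` stand-in for a noise predictor are all assumptions made for this toy.

```python
import math
from typing import Callable, List, Tuple

Schedule = List[Tuple[float, float]]  # per-step (a, b) coefficients, a != 0


def forward_step(x: float, y: float, a: float, b: float,
                 eps: Callable[[float], float]) -> Tuple[float, float]:
    """One coupled update: x is updated from y, then y from the new x."""
    x_next = a * x + b * eps(y)
    y_next = a * y + b * eps(x_next)
    return x_next, y_next


def inverse_step(x_next: float, y_next: float, a: float, b: float,
                 eps: Callable[[float], float]) -> Tuple[float, float]:
    """Algebraic inverse of forward_step: undo the y update, then the x update."""
    y = (y_next - b * eps(x_next)) / a
    x = (x_next - b * eps(y)) / a
    return x, y


def run(x: float, y: float, schedule: Schedule,
        eps: Callable[[float], float]) -> Tuple[float, float]:
    """Apply every step of the schedule in order."""
    for a, b in schedule:
        x, y = forward_step(x, y, a, b, eps)
    return x, y


def invert(x: float, y: float, schedule: Schedule,
           eps: Callable[[float], float]) -> Tuple[float, float]:
    """Undo the schedule by applying inverse steps in reverse order."""
    for a, b in reversed(schedule):
        x, y = inverse_step(x, y, a, b, eps)
    return x, y


# Toy usage: math.tanh stands in for a learned noise predictor.
schedule = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3)]
start = (0.5, -0.25)
end = run(start[0], start[1], schedule, math.tanh)
recovered = invert(end[0], end[1], schedule, math.tanh)
```

The property that matters is that `inverse_step` needs no approximation of `eps` at an unknown point: every argument it evaluates is already available. That is what lets an input image be mapped to a latent and reconstructed without drift, which matters for identity preservation when only part of the image is edited.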
