IntrinsicEdit: Precise generative image manipulation in intrinsic space

📅 2025-05-13
🤖 AI Summary
Current generative diffusion models for image editing support prompt- and semantics-guided manipulation but lack pixel-level precision and are typically constrained to single-task scenarios. To address this, we propose a unified editing framework grounded in an intrinsic-image latent space: the first method to integrate exact diffusion inversion with disentangled intrinsic channels—such as diffuse reflectance, specular reflectance, and surface normals—within an RGB-X diffusion architecture. Our approach enables targeted latent-space editing without fine-tuning or auxiliary data. By operating on physically grounded intrinsic representations, it inherently preserves global illumination consistency and object identity. The framework supports diverse operations including color/texture editing, object insertion/deletion, relighting, and composite edits. Extensive evaluation demonstrates state-of-the-art performance on complex images, achieving superior fidelity, precise controllability, and seamless multi-task compatibility.

📝 Abstract
Generative diffusion models have advanced image editing with high-quality results and intuitive interfaces such as prompts and semantic drawing. However, these interfaces lack precise control, and the associated methods typically specialize in a single editing task. We introduce a versatile, generative workflow that operates in an intrinsic-image latent space, enabling semantic, local manipulation with pixel precision for a range of editing operations. Building atop the RGB-X diffusion framework, we address key challenges of identity preservation and intrinsic-channel entanglement. By incorporating exact diffusion inversion and disentangled channel manipulation, we enable precise, efficient editing with automatic resolution of global illumination effects -- all without additional data collection or model fine-tuning. We demonstrate state-of-the-art performance across a variety of tasks on complex images, including color and texture adjustments, object insertion and removal, global relighting, and their combinations.
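
As a rough illustration of what pixel-precise, per-channel editing means, the sketch below replaces one intrinsic channel inside a binary mask and leaves every other channel and pixel untouched. It is not the paper's pipeline: it works on plain image-space arrays rather than diffusion latents, and the channel names (`albedo`, `normal`, `roughness`) and the helper `edit_channel` are hypothetical, chosen only for this example.

```python
import numpy as np

def edit_channel(channels, name, mask, new_value):
    """Return a copy of `channels` in which `channels[name]` is set to
    `new_value` inside the boolean `mask`; all other data is unchanged."""
    edited = {key: value.copy() for key, value in channels.items()}
    edited[name][mask] = new_value  # broadcast new_value over masked pixels
    return edited

# Toy intrinsic decomposition (assumed layout: H x W x C float arrays).
h, w = 4, 4
rng = np.random.default_rng(0)
channels = {
    "albedo": rng.random((h, w, 3)),
    "normal": rng.random((h, w, 3)),
    "roughness": rng.random((h, w, 1)),
}

# Pixel-level selection: a 2x2 block in the middle of the image.
mask = np.zeros((h, w), dtype=bool)
mask[1:3, 1:3] = True

# Recolor only the albedo inside the mask.
edited = edit_channel(channels, "albedo", mask, np.array([1.0, 0.0, 0.0]))
```

In this toy setting the edit is trivially local. In a generative workflow such as the one the abstract describes, the edited channels would then condition image synthesis, so that shading and other global illumination effects are recomputed rather than pasted in.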
Problem

Research questions and friction points this paper is trying to address.

Achieving precise generative image manipulation in intrinsic space
Overcoming identity preservation and intrinsic-channel entanglement challenges
Enabling versatile editing without additional data or model fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Operates in intrinsic-image latent space
Uses exact diffusion inversion technique
Manipulates disentangled intrinsic channels precisely
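
To make the idea of "exact" inversion concrete, here is a minimal sketch of an update rule that can be undone without approximation error beyond floating-point rounding. This summary does not say which inversion scheme the paper uses, so the coupled two-state construction below is an assumed stand-in, and `toy_eps`, `step`, `step_inverse`, and the coefficients `a` and `b` are all hypothetical.

```python
import numpy as np

def toy_eps(z, t):
    # Stand-in for a noise-prediction network (assumption): any
    # deterministic function of the state and step index works here.
    return np.tanh(z) * (0.1 + 0.01 * t)

def step(x, y, t, a=0.98, b=0.05):
    # Each state is updated using only the *other* state, so every
    # update is an affine map that can be solved in closed form.
    x_new = a * x + b * toy_eps(y, t)
    y_new = a * y + b * toy_eps(x_new, t)
    return x_new, y_new

def step_inverse(x_new, y_new, t, a=0.98, b=0.05):
    # Undo the two updates in reverse order.
    y = (y_new - b * toy_eps(x_new, t)) / a
    x = (x_new - b * toy_eps(y, t)) / a
    return x, y

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))
y0 = x0.copy()

x, y = x0, y0
steps = list(range(10))
for t in steps:            # run the process one way
    x, y = step(x, y, t)
for t in reversed(steps):  # run it back
    x, y = step_inverse(x, y, t)

max_err = float(np.max(np.abs(x - x0)))  # round-trip reconstruction error
```

The point of the construction is that the round trip recovers the starting state up to rounding, which is the property that lets an editing method preserve the identity of the unedited parts of an image.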
Authors
Linjie Lyu
Max-Planck-Institute for Informatics, Saarland Informatics Campus, Germany and Adobe Research, UK
Valentin Deschaintre
Research Scientist, Adobe
Computer Graphics, Inverse rendering, 3D Content Generation & Editing
Yannick Hold-Geoffroy
Senior Research Scientist, Adobe
Computer Vision, Machine Learning
Miloš Hašan
Adobe Research, USA
Jae Shin Yoon
Adobe Inc.
3D Vision, Graphics, Computer Vision, Robot Vision
Thomas Leimkühler
Max-Planck-Institute for Informatics, Saarland Informatics Campus, Germany
C. Theobalt
Max-Planck-Institute for Informatics, Saarland Informatics Campus, Germany
Iliyan Georgiev
Adobe Research
Computer Graphics, Global Illumination, Ray Tracing, Monte Carlo, Stochastic Sampling