LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing image generation methods struggle to precisely control occlusion relationships among objects: text-guided approaches lack geometric consistency, while layout-to-image methods neglect depth-order modeling. This paper introduces the first occlusion-control framework that operates in the latent space of pretrained diffusion models without fine-tuning, applying volumetric rendering principles. By estimating per-object transmittance and compositing latents in occlusion order, it enables physically consistent foreground-background reasoning. The method also supports fine-grained control over multiple visual attributes, including transparency, fog concentration, and illumination intensity. Evaluated on multiple benchmarks, it achieves significantly higher occlusion accuracy than state-of-the-art methods while relying solely on off-the-shelf pretrained diffusion models and requiring no additional training.

📝 Abstract
We propose a novel training-free image generation algorithm that precisely controls the occlusion relationships between objects in an image. Existing image generation methods typically rely on prompts to influence occlusion, which often lack precision. While layout-to-image methods provide control over object locations, they fail to address occlusion relationships explicitly. Given a pre-trained image diffusion model, our method leverages volume rendering principles to "render" the scene in latent space, guided by occlusion relationships and the estimated transmittance of objects. This approach does not require retraining or fine-tuning the image diffusion model, yet it enables accurate occlusion control due to its physics-grounded foundation. In extensive experiments, our method significantly outperforms existing approaches in terms of occlusion accuracy. Furthermore, we demonstrate that by adjusting the opacities of objects or concepts during rendering, our method can achieve a variety of effects, such as altering the transparency of objects, the density of masses (e.g., forests), the concentration of particles (e.g., rain, fog), the intensity of light, and the strength of lens effects.
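The paper does not reproduce its procedure here, but the volume-rendering principle the abstract invokes can be sketched as front-to-back alpha compositing of per-object latents, where each object's contribution is weighted by the transmittance accumulated through the objects occluding it. The function name, the scalar per-object opacities, and the simple weighted sum below are illustrative assumptions, not the authors' exact method:

```python
import numpy as np

def latent_composite(latents, opacities):
    """Composite object latents front-to-back, following the
    volume-rendering accumulation T_i = prod_{j<i}(1 - alpha_j).

    latents:   list of same-shape arrays, ordered front (occluder) to back
    opacities: per-object scalar alphas in [0, 1] (1.0 = fully opaque)
    """
    composite = np.zeros_like(latents[0])
    transmittance = 1.0  # fraction of "light" surviving past objects in front
    for latent, alpha in zip(latents, opacities):
        composite += transmittance * alpha * latent  # nearer objects weigh more
        transmittance *= 1.0 - alpha  # an opaque object blocks everything behind it
    return composite
```

Lowering an object's opacity lets the latents behind it contribute more to the composite, which mirrors the transparency, fog-concentration, and density effects described above.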
Problem

Research questions and friction points this paper is trying to address.

Imprecise control of occlusion relationships in prompt- and layout-based image generation
Reliance on retraining or fine-tuning diffusion models to add new forms of control
Lack of a single mechanism for diverse opacity-driven visual effects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free latent space rendering for occlusion
Physics-based transmittance guides object occlusion
Adjustable opacity controls diverse visual effects