Parametric Shadow Control for Portrait Generationin Text-to-Image Diffusion Models

πŸ“… 2025-03-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Text-to-image diffusion models lack intuitive, fine-grained control over shadow shape, position, and intensity in portrait generation; existing editing approaches either rely on costly real-world light-field data or suffer from high computational overhead and poor generalization. Method: We propose Shadow Directorβ€”a novel framework that, for the first time, decouples and parameterizes shadow attributes directly within the latent space of pre-trained diffusion models. It employs a lightweight shadow estimation network, feature redirection, and a parameterized control mechanism, trained exclusively on a small synthetic dataset (thousands of images) in just a few hours. Contribution/Results: Shadow Director enables real-time, identity-preserving, cross-style shadow editing without requiring retraining. It achieves superior generalization across diverse portrait styles, reduces training cost by two orders of magnitude, and significantly enhances artistic fidelity and controllability.

Technology Category

Application Category

πŸ“ Abstract
Text-to-image diffusion models excel at generating diverse portraits, but lack intuitive shadow control. Existing editing approaches, as post-processing, struggle to offer effective manipulation across diverse styles. Additionally, these methods either rely on expensive real-world light-stage data collection or require extensive computational resources for training. To address these limitations, we introduce Shadow Director, a method that extracts and manipulates hidden shadow attributes within well-trained diffusion models. Our approach uses a small estimation network that requires only a few thousand synthetic images and hours of training-no costly real-world light-stage data needed. Shadow Director enables parametric and intuitive control over shadow shape, placement, and intensity during portrait generation while preserving artistic integrity and identity across diverse styles. Despite training only on synthetic data built on real-world identities, it generalizes effectively to generated portraits with diverse styles, making it a more accessible and resource-friendly solution.
Problem

Research questions and friction points this paper is trying to address.

Lack intuitive shadow control in text-to-image diffusion models
Existing methods require expensive data or computational resources
Need for preserving artistic integrity across diverse portrait styles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts hidden shadow attributes in diffusion models
Uses small network with synthetic data only
Enables parametric control over shadow properties
πŸ”Ž Similar Papers
No similar papers found.