Parametric Shadow Control for Portrait Generation in Text-to-Image Diffusion Models

📅 2025-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Text-to-image diffusion models lack intuitive, fine-grained control over shadow shape, position, and intensity in portrait generation; existing editing approaches either rely on costly real-world light-stage data or suffer from high computational overhead and poor generalization. Method: We propose Shadow Director, a novel framework that, for the first time, decouples and parameterizes shadow attributes directly within the latent space of pre-trained diffusion models. It employs a lightweight shadow estimation network, feature redirection, and a parameterized control mechanism, trained exclusively on a small synthetic dataset (a few thousand images) in just a few hours. Contribution/Results: Shadow Director enables real-time, identity-preserving, cross-style shadow editing without retraining the underlying diffusion model. It achieves superior generalization across diverse portrait styles, reduces training cost by two orders of magnitude, and significantly enhances artistic fidelity and controllability.

📝 Abstract

Text-to-image diffusion models excel at generating diverse portraits but lack intuitive shadow control. Existing editing approaches, applied as post-processing, struggle to offer effective manipulation across diverse styles. Additionally, these methods either rely on expensive real-world light-stage data collection or require extensive computational resources for training. To address these limitations, we introduce Shadow Director, a method that extracts and manipulates hidden shadow attributes within well-trained diffusion models. Our approach uses a small estimation network that requires only a few thousand synthetic images and hours of training, with no costly real-world light-stage data needed. Shadow Director enables parametric and intuitive control over shadow shape, placement, and intensity during portrait generation while preserving artistic integrity and identity across diverse styles. Despite training only on synthetic data built on real-world identities, it generalizes effectively to generated portraits with diverse styles, making it a more accessible and resource-friendly solution.

Problem

Research questions and friction points this paper is trying to address.

Lack of intuitive shadow control in text-to-image diffusion models

Existing methods require expensive data or computational resources

Need for preserving artistic integrity across diverse portrait styles

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts hidden shadow attributes in diffusion models

Uses a small estimation network trained only on synthetic data

Enables parametric control over shadow properties
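
To make "parametric control over shadow properties" concrete, here is a toy sketch in pixel space. It is not the Shadow Director method, which operates on latent features of a diffusion model; it only illustrates how a few scalar parameters (intensity, direction, edge softness) can fully describe a shadow. The class `ShadowParams`, its fields, and both functions are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
import math


@dataclass
class ShadowParams:
    """Hypothetical parameter set; names and ranges are illustrative only."""
    intensity: float  # 0.0 = no shadow, 1.0 = fully dark
    angle_deg: float  # direction in which the shadow deepens (image coordinates)
    softness: float   # width of the lit-to-shadow transition, as a fraction of image size


def shadow_mask(width, height, p):
    """Return a height x width grid of brightness multipliers in [1 - intensity, 1]."""
    dx = math.cos(math.radians(p.angle_deg))
    dy = math.sin(math.radians(p.angle_deg))
    soft = max(p.softness, 1e-6)  # avoid division by zero for a hard edge
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Pixel-centre offset from the image centre, normalised to about [-0.5, 0.5].
            u = (x + 0.5) / width - 0.5
            v = (y + 0.5) / height - 0.5
            d = u * dx + v * dy  # signed distance along the shadow direction
            t = min(max(d / soft + 0.5, 0.0), 1.0)
            t = t * t * (3 - 2 * t)  # smoothstep gives a soft shadow edge
            row.append(1.0 - p.intensity * t)
        mask.append(row)
    return mask


def apply_shadow(image, p):
    """Darken a grayscale image given as a list of rows of floats in [0, 1]."""
    h, w = len(image), len(image[0])
    mask = shadow_mask(w, h, p)
    return [[image[y][x] * mask[y][x] for x in range(w)] for y in range(h)]
```

In this sketch, changing `angle_deg` moves the shadow, `softness` changes the sharpness of its edge, and `intensity` changes its depth, each independently of the others. That independence is the property the bullet above refers to, here achieved by construction rather than learned.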
