🤖 AI Summary
Existing relightable full-body avatar methods rely on explicit 3D intrinsic decomposition and analytic reflectance models, and they struggle to combine expressive facial animation with photorealistic relighting. This work proposes D-Rex, a person-specific framework that treats relighting as an image-space post-process: a pretrained video diffusion relighting model, fine-tuned with LoRA on paired flat-lit and relit frames captured in a light stage, translates flat-lit renderings to a target HDR illumination. This fully decouples relighting from avatar modeling, so existing flat-lit (white-light) avatar pipelines can be used without modification. D-Rex supports free-viewpoint, expressive, and temporally consistent relighting, preserves fine facial expression and motion detail, and outperforms physically based relightable avatar baselines.
📝 Abstract
We present D-Rex, a person-specific framework for photorealistic, relightable, expressive, and animatable full-body human avatars with free-viewpoint rendering. Existing methods for relightable full-body avatars rely on explicit 3D intrinsic decomposition with analytic reflectance models, which require accurate geometry registration and careful optimization to capture realistic light transport effects. This tight coupling of relighting with avatar modeling has hindered expressiveness: to our knowledge, no existing method demonstrates strong facial animation alongside relighting, which limits applicability in telepresence, gaming, and virtual production. We propose to decouple relighting entirely from avatar modeling by treating it as an image-space post-process: a learned translation from flat-lit, albedo-like renderings to a target HDR illumination. To this end, we leverage the strong generative prior of a pre-trained video diffusion relighting model, fine-tuned via LoRA on paired flat-lit and relit frames captured in a light stage. The flat-lit driving frames are produced by an independent expressive full-body avatar framework trained under white-light conditions. That framework requires no modification to support relighting, so D-Rex is directly applicable to any white-light avatar system. We demonstrate that D-Rex enables view-consistent and temporally consistent relighting while faithfully preserving expressive motion and fine-grained facial detail, outperforming physically based relightable avatar baselines. Project page: https://vcai.mpi-inf.mpg.de/projects/DRex/
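The abstract mentions LoRA fine-tuning without spelling it out. As a rough illustration only, the sketch below shows the generic LoRA idea on a single weight matrix: the pretrained weight W stays frozen and only a low-rank update (alpha / r) * B A is trained. The matrix sizes, rank, scaling value, and helper names (`matmul`, `lora_weight`) are assumptions chosen for illustration; this is not the D-Rex implementation, and the layers actually adapted in the video diffusion model are not specified here.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Hypothetical toy sizes; real diffusion layers are far larger.
d_out, d_in, rank, alpha = 4, 6, 2, 4.0
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen pretrained weight
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, small random init
B = [[0.0] * rank for _ in range(d_out)]                              # trainable, zero init

# With B initialised to zero, the adapted layer starts identical to the pretrained one.
W_adapted = lora_weight(W, A, B, alpha)

# Only A and B are trained: rank * (d_in + d_out) values instead of d_out * d_in.
trainable = rank * (d_in + d_out)
full = d_out * d_in
```

Because B starts at zero, fine-tuning begins exactly at the pretrained model, which is one reason a low-rank adapter is a natural fit when the goal is to keep a strong generative prior while specialising on a small set of paired captures. The parameter saving is negligible at these toy sizes and only becomes significant when the rank is much smaller than the layer dimensions.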