DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

📅 2025-05-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Single-image 3D hair reconstruction is challenging due to high hairstyle diversity, the scarcity of paired image-to-3D data, and prior methods' reliance on low-dimensional intermediate representations (guide strands, scalp-level embeddings) that require post-processing, which limits the modeling of complex curly hairstyles (e.g., afro). This paper proposes an end-to-end, strand-level generative framework based on a diffusion transformer, eliminating guide-strand initialization and post-hoc refinement. The authors introduce the largest synthetic 3D hairstyle dataset to date (40K samples) and couple a scalp-texture latent representation with a pretrained vision backbone, enabling generalization to in-the-wild images despite training only on synthetic data. The method synthesizes geometrically accurate individual hair strands directly from a single frontal image: strand latents are decoded straight to 3D strands without upsampling or post-processing. On real images it significantly improves curliness, density, and structural fidelity over prior work, recovering highly curled hairstyles such as afros for the first time.

📝 Abstract
We address the task of generating 3D hair geometry from a single image, which is challenging due to the diversity of hairstyles and the lack of paired image-to-3D hair data. Previous methods are primarily trained on synthetic data and cope with the limited amount of such data by using low-dimensional intermediate representations, such as guide strands and scalp-level embeddings, that require post-processing to decode, upsample, and add realism. These approaches fail to reconstruct detailed hair, struggle with curly hair, or are limited to handling only a few hairstyles. To overcome these limitations, we propose DiffLocks, a novel framework that enables detailed reconstruction of a wide variety of hairstyles directly from a single image. First, we address the lack of 3D hair data by automating the creation of the largest synthetic hair dataset to date, containing 40K hairstyles. Second, we leverage the synthetic hair dataset to learn an image-conditioned diffusion-transformer model that generates accurate 3D strands from a single frontal image. By using a pretrained image backbone, our method generalizes to in-the-wild images despite being trained only on synthetic data. Our diffusion model predicts a scalp texture map in which any point in the map contains the latent code for an individual hair strand. These codes are directly decoded to 3D strands without post-processing techniques. Representing individual strands, instead of guide strands, enables the transformer to model the detailed spatial structure of complex hairstyles. With this, DiffLocks can recover highly curled hair, like afro hairstyles, from a single image for the first time. Data and code are available at https://radualexandru.github.io/difflocks/
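The abstract's core representation, a scalp texture map whose texels each hold a latent code that is decoded directly to an individual 3D strand, can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the map resolution, latent size, points per strand, and the random linear `decode_strand` (standing in for the learned decoder network) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes -- the paper does not publish these exact sizes here.
H, W, LATENT_DIM = 64, 64, 16   # scalp texture resolution, strand latent size
POINTS_PER_STRAND = 32          # 3D points per decoded strand

# A generated scalp texture map: every texel holds one strand's latent code.
scalp_texture = rng.normal(size=(H, W, LATENT_DIM)).astype(np.float32)

# Stand-in decoder: a fixed random linear map from latent code to a polyline
# of 3D offsets (the real decoder is a learned network).
W_dec = (rng.normal(size=(LATENT_DIM, POINTS_PER_STRAND * 3)) * 0.01).astype(np.float32)

def decode_strand(latent: np.ndarray, root: np.ndarray) -> np.ndarray:
    """Decode one latent code into a strand of 3D points starting at `root`."""
    offsets = (latent @ W_dec).reshape(POINTS_PER_STRAND, 3)
    return root + np.cumsum(offsets, axis=0)  # integrate offsets along the strand

# Sample texels (scalp locations) and decode each one to an individual strand:
# no guide strands, no upsampling step.
uv = rng.integers(0, (H, W), size=(1000, 2))
roots = np.concatenate([uv / (H, W), np.zeros((1000, 1))], axis=1).astype(np.float32)
strands = np.stack([decode_strand(scalp_texture[u, v], r)
                    for (u, v), r in zip(uv, roots)])
print(strands.shape)  # (1000, 32, 3): 1000 strands, 32 points each
```

The point of the sketch is the data flow: because every texel is a full strand latent, strand density comes from how many texels are sampled, rather than from interpolating between guide strands.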
Problem

Research questions and friction points this paper is trying to address.

Generating detailed 3D hair from single images
Overcoming lack of paired image-to-3D hair data
Reconstructing diverse hairstyles including curly hair
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated creation of the largest synthetic hair dataset to date (40K hairstyles)
Image-conditioned diffusion-transformer for 3D strands
Scalp texture map decoding without post-processing
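The pipeline implied by these bullets (a pretrained image backbone conditioning a diffusion transformer that denoises the scalp map of strand latents) can be caricatured as a reverse-diffusion loop. Everything below is a placeholder: `image_features` fakes a frozen vision backbone with a mean, `denoiser` is not a trained model, and the update skips the noise schedule; it only illustrates the conditioning data flow.

```python
import numpy as np

rng = np.random.default_rng(1)

H, W, LATENT_DIM, STEPS = 64, 64, 16, 10  # hypothetical sizes

def image_features(image: np.ndarray) -> np.ndarray:
    """Stand-in for a pretrained vision backbone (e.g. frozen ViT features)."""
    return image.mean(axis=(0, 1))  # hypothetical global feature vector

def denoiser(x_t: np.ndarray, t: int, cond: np.ndarray) -> np.ndarray:
    """Toy stand-in for the diffusion transformer: predicts noise from the
    current latent map, the timestep, and the image conditioning."""
    return 0.1 * x_t + 0.01 * cond.mean()  # placeholder, not a trained model

image = rng.normal(size=(256, 256, 3)).astype(np.float32)
cond = image_features(image)

# Reverse diffusion over the scalp texture map of strand latents,
# conditioned on the input image's features.
x = rng.normal(size=(H, W, LATENT_DIM)).astype(np.float32)
for t in reversed(range(STEPS)):
    eps_hat = denoiser(x, t, cond)
    x = x - eps_hat  # simplified update; real samplers follow a noise schedule
print(x.shape)  # (64, 64, 16): the denoised scalp map of strand latents
```

The output map would then be decoded texel-by-texel into strands, which is why no separate upsampling or refinement stage is needed after sampling.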