Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System

📅 2025-03-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing text-to-3D human and garment generation methods do not offer an accessible, integrated pipeline for producing ready-to-use clothed avatars, so their outputs are often unsuitable for direct simulation or rendering. Tailor addresses this with a three-stage system: (1) a large language model interprets the text description into parameterized body shapes and semantically matched garment templates; (2) topology-preserving deformation with novel geometric losses adapts the garments to the body geometry; and (3) a texture diffusion module with a symmetric local attention mechanism provides view consistency and photorealistic detail. The authors' quantitative and qualitative evaluations show Tailor outperforming state-of-the-art methods in fidelity, usability, and diversity. The result is a text-to-avatar workflow that produces customizable, high-fidelity 3D humans with simulation-ready garments.

📝 Abstract

Creating detailed 3D human avatars with garments typically requires specialized expertise and labor-intensive processes. Although recent advances in generative AI have enabled text-to-3D human/clothing generation, current methods fall short in offering accessible, integrated pipelines for producing ready-to-use clothed avatars. To solve this, we introduce Tailor, an integrated text-to-avatar system that generates high-fidelity, customizable 3D humans with simulation-ready garments. Our system includes a three-stage pipeline. We first employ a large language model to interpret textual descriptions into parameterized body shapes and semantically matched garment templates. Next, we develop topology-preserving deformation with novel geometric losses to adapt garments precisely to body geometries. Furthermore, an enhanced texture diffusion module with a symmetric local attention mechanism ensures both view consistency and photorealistic details. Quantitative and qualitative evaluations demonstrate that Tailor outperforms existing SoTA methods in terms of fidelity, usability, and diversity. Code will be available for academic use.

Problem

Research questions and friction points this paper is trying to address.

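Creating detailed, clothed 3D avatars normally requires specialized expertise and labor-intensive work

Existing text-to-3D human and clothing methods lack an accessible pipeline that integrates body shape and garment template generation

Generated avatars need to be high-fidelity, customizable, and simulation-ready to be usable without further manual work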

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated text-to-avatar system in which a large language model maps text to parameterized body shapes and semantically matched garment templates

Topology-preserving garment deformation with geometric losses

Enhanced texture diffusion with symmetric local attention
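
To make the idea of topology-preserving deformation more concrete, here is a minimal NumPy sketch. It is not the paper's method: the summary does not specify Tailor's geometric losses, so this example assumes a simple stand-in, an edge-length preservation penalty combined with a per-vertex fitting term. The names `deform_garment`, `unique_edges`, `edge_lengths`, and the parameters `lam`, `lr`, and `steps` are all hypothetical. The point it illustrates is that only vertex positions are optimized while the face list is never modified, so the mesh connectivity is unchanged.

```python
import numpy as np


def unique_edges(faces):
    """Collect the undirected edges of a triangle mesh, one row per edge."""
    e = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    return np.unique(np.sort(e, axis=1), axis=0)


def edge_lengths(verts, edges):
    """Euclidean length of every edge."""
    return np.linalg.norm(verts[edges[:, 0]] - verts[edges[:, 1]], axis=1)


def deform_garment(verts, faces, targets, lam=1.0, lr=0.02, steps=2000):
    """Move vertices toward `targets` while penalizing edge-length change.

    Assumed energy (illustrative only, not taken from the paper):
        sum_i ||v_i - t_i||^2 + lam * sum_edges (|v_i - v_j| - rest_ij)^2
    The face array is returned untouched, so the topology is preserved.
    """
    edges = unique_edges(faces)
    i, j = edges[:, 0], edges[:, 1]
    rest = edge_lengths(verts, edges)  # rest lengths from the template
    v = verts.astype(float).copy()
    for _ in range(steps):
        grad = 2.0 * (v - targets)  # gradient of the fitting term
        d = v[i] - v[j]
        length = np.linalg.norm(d, axis=1)
        # Gradient of (length - rest)^2 with respect to v_i; v_j gets the negative.
        coef = 2.0 * lam * (length - rest) / np.maximum(length, 1e-12)
        g = coef[:, None] * d
        np.add.at(grad, i, g)
        np.add.at(grad, j, -g)
        v -= lr * grad  # plain gradient descent step
    return v, faces


# Toy usage: a two-triangle square "garment" pulled toward a 1.5x larger target.
template = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
tris = np.array([[0, 1, 2], [0, 2, 3]])
fitted, fitted_tris = deform_garment(template, tris, template * 1.5, lam=2.0)
```

With `lam=0` the vertices simply snap to the targets; raising `lam` keeps the edge lengths closer to the template, which is the trade-off between fitting the body and keeping the garment's original shape. A real system would replace the per-vertex targets with a body-aware fitting term.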

🔎 Similar Papers

Garment3DGen: 3D Garment Stylization and Texture Generation