FW-VTON: Flattening-and-Warping for Person-to-Person Virtual Try-on

📅 2025-07-21
🤖 AI Summary
This paper addresses the task of person-to-person virtual try-on: generating a photorealistic image of a target person wearing a garment that appears on a different person in a source image, given only those two images. The authors propose the first end-to-end, three-stage Flattening-and-Warping framework: (1) segmenting the garment in the source image and flattening it into a canonical 2D layout; (2) pose-guided warping that aligns the flattened garment with the target person’s body geometry; and (3) detail-preserving fusion and adversarial refinement for enhanced realism. To support this task, they introduce the first high-quality person-to-person virtual try-on benchmark dataset. The method is reported to outperform existing approaches in garment segmentation accuracy, warping robustness under diverse poses, and overall visual fidelity, in both quantitative metrics and qualitative evaluations.

📝 Abstract
Traditional virtual try-on methods primarily focus on the garment-to-person try-on task, which requires flat garment representations. In contrast, this paper introduces a novel approach to the person-to-person try-on task. Unlike the garment-to-person try-on task, the person-to-person task only involves two input images: one depicting the target person and the other showing the garment worn by a different individual. The goal is to generate a realistic combination of the target person with the desired garment. To this end, we propose Flattening-and-Warping Virtual Try-On (FW-VTON), a method that operates in three stages: (1) extracting the flattened garment image from the source image; (2) warping the garment to align with the target pose; and (3) integrating the warped garment seamlessly onto the target person. To overcome the challenges posed by the lack of high-quality datasets for this task, we introduce a new dataset specifically designed for person-to-person try-on scenarios. Experimental evaluations demonstrate that FW-VTON achieves state-of-the-art performance, with superior results in both qualitative and quantitative assessments, and also excels in garment extraction subtasks.
Problem

Research questions and friction points this paper is trying to address.

Develops person-to-person virtual try-on using two images
Proposes FW-VTON for garment flattening, warping, and integration
Introduces new dataset for person-to-person try-on scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts flattened garment from source image
Warps garment to match target pose
Integrates garment seamlessly onto target
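
The three stages listed above can be read as a composition of functions over the two input images. The sketch below only illustrates that data flow. The class name `TryOnPipeline`, the toy stage functions, the `GARMENT` label value, and the list-of-lists image type are all assumptions made for illustration; they do not reflect how FW-VTON implements any stage.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]  # toy stand-in for an image: a 2D grid of pixel labels
GARMENT = 7              # hypothetical label marking garment pixels


@dataclass
class TryOnPipeline:
    """Hypothetical wiring of the three stages; not the authors' code."""
    flatten: Callable[[Image], Image]           # stage 1: source image -> garment only
    warp: Callable[[Image, Image], Image]       # stage 2: align garment to the target
    integrate: Callable[[Image, Image], Image]  # stage 3: place garment on the target

    def __call__(self, source: Image, target: Image) -> Image:
        flat = self.flatten(source)
        warped = self.warp(flat, target)
        return self.integrate(warped, target)


def toy_flatten(source: Image) -> Image:
    # Keep garment pixels and zero out everything else.
    return [[p if p == GARMENT else 0 for p in row] for row in source]


def toy_warp(flat: Image, target: Image) -> Image:
    # Nearest-neighbour resize of the garment to the target's height and width.
    h, w = len(target), len(target[0])
    fh, fw = len(flat), len(flat[0])
    return [[flat[r * fh // h][c * fw // w] for c in range(w)] for r in range(h)]


def toy_integrate(warped: Image, target: Image) -> Image:
    # Paste non-empty garment pixels over the target pixels.
    return [[g if g != 0 else t for g, t in zip(g_row, t_row)]
            for g_row, t_row in zip(warped, target)]


pipeline = TryOnPipeline(toy_flatten, toy_warp, toy_integrate)
source = [[1, 7], [7, 1]]                 # 2x2 "source person" with garment pixels
target = [[2, 2, 2, 2] for _ in range(4)]  # 4x4 "target person"
result = pipeline(source, target)
```

Keeping each stage behind its own callable mirrors the staged description: the intermediate flattened garment is an explicit value, so it can be inspected or evaluated separately, which is in line with the abstract's mention of garment extraction as a subtask.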
Zheng Wang
Shanghai Jiao Tong University, China
Xianbing Sun
Shanghai Jiao Tong University, China
Shengyi Wu
Shanghai Jiao Tong University, China
Jiahui Zhan
Shanghai Jiao Tong University, China
Jianlou Si
alibaba-inc.com
MLLM, GenAI, AGI, Embodied AI
Chi Zhang
TeleAI, China Telecom, China
Liqing Zhang
Professor @ Computer Science, Virginia Tech
Bioinformatics, data analytics, machine learning
Jianfu Zhang
Shanghai Jiao Tong University
Machine Learning, Computer Vision