DualFit: A Two-Stage Virtual Try-On via Warping and Synthesis

📅 2025-08-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Virtual try-on methods often fail to faithfully preserve high-frequency garment details (e.g., logos, prints), undermining brand representation and user trust. To address this, we propose DualFit, a two-stage high-fidelity virtual try-on framework. In Stage I, a learned flow-field deformation network achieves precise garment-to-body alignment. In Stage II, a region-aware inpainting mechanism, incorporating semantic preservation masks and diffusion-based image completion, is applied exclusively to non-critical regions, strictly preserving high-frequency structural details (e.g., logos) in their original form. DualFit is the first approach to jointly model structured semantic preservation and generative synthesis, significantly enhancing detail integrity and visual naturalness. Extensive experiments demonstrate that DualFit substantially outperforms state-of-the-art methods across multiple benchmarks, particularly in logo/print fidelity, boundary blending quality, and overall perceptual realism.
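
As a rough sketch of how Stage I's flow-based alignment can be realized, the snippet below resamples a garment image through a dense flow field with bilinear interpolation. The function name, the PyTorch grid_sample backend, and the assumption that offsets are predicted in normalized [-1, 1] coordinates are illustrative choices, not details taken from the paper.

```python
# Minimal sketch of Stage I flow-based warping (illustrative, not the
# authors' implementation). Assumes the flow network outputs per-pixel
# offsets in normalized [-1, 1] image coordinates.
import torch
import torch.nn.functional as F

def warp_garment(garment: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """garment: (N, 3, H, W) image; flow: (N, 2, H, W) offsets.
    Returns the garment resampled toward the person's pose."""
    n, _, h, w = garment.shape
    # Identity sampling grid in normalized coordinates, shape (H, W, 2).
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h), torch.linspace(-1.0, 1.0, w),
        indexing="ij",
    )
    base_grid = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2).to(garment)
    # Shift each sampling location by the predicted flow, then resample.
    grid = base_grid + flow.permute(0, 2, 3, 1)
    return F.grid_sample(garment, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)
```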

📝 Abstract
Virtual Try-On technology has garnered significant attention for its potential to transform the online fashion retail experience by allowing users to visualize how garments would look on them without physical trials. While recent advances in diffusion-based, warping-free methods have improved perceptual quality, they often fail to preserve fine-grained garment details such as logos and printed text, elements that are critical for brand integrity and customer trust. In this work, we propose DualFit, a hybrid VTON pipeline that addresses this limitation with a two-stage approach. In the first stage, DualFit warps the target garment to align with the person image using a learned flow field, ensuring high-fidelity preservation of garment details. In the second stage, a fidelity-preserving try-on module synthesizes the final output by blending the warped garment with preserved human regions. To guide this process, we introduce a preserved-region input and an inpainting mask, enabling the model to retain key areas and regenerate only where necessary, particularly around garment seams. Extensive qualitative results show that DualFit achieves visually seamless try-on results while faithfully maintaining high-frequency garment details, striking an effective balance between reconstruction accuracy and perceptual realism.
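
The compositing logic implied by the preserved-region input and inpainting mask can be pictured as a hard blend: regenerate pixels only where the mask allows and copy the person image everywhere else. The sketch below assumes a hypothetical `tryon_net` module standing in for the paper's fidelity-preserving try-on network.

```python
# Hedged sketch of the Stage II compositing step. `tryon_net` is a
# hypothetical stand-in for the fidelity-preserving try-on module.
import torch

def compose_tryon(tryon_net, person: torch.Tensor,
                  warped_garment: torch.Tensor,
                  inpaint_mask: torch.Tensor) -> torch.Tensor:
    """person, warped_garment: (N, 3, H, W).
    inpaint_mask: (N, 1, H, W), 1 = regenerate, 0 = preserve."""
    # Preserved-region input: original pixels outside the inpainting mask.
    preserved = person * (1.0 - inpaint_mask)
    # The module synthesizes content for the masked area, conditioned on
    # the preserved pixels and the warped garment.
    generated = tryon_net(preserved, warped_garment, inpaint_mask)
    # Hard blend: copy preserved pixels back exactly; regenerate the rest.
    return inpaint_mask * generated + (1.0 - inpaint_mask) * person
```

A hard blend of this kind would guarantee bit-exact preservation of logos and prints outside the mask, leaving the dilated seam region as the only place where the generator must hide blending artifacts.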
Problem

Research questions and friction points this paper is trying to address.

Preserve fine-grained garment details like logos and text
Align target garment with person image accurately
Balance reconstruction accuracy and perceptual realism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage warping and synthesis pipeline
Learned flow field for garment alignment
Preserved-region input and inpainting mask
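
For concreteness, one plausible way to build the preserved-region input and the inpainting mask is from a human-parsing label map, dilating the mask slightly so the model can re-synthesize garment seams. The label IDs and the max-pooling dilation below are assumptions for illustration, not the paper's actual scheme.

```python
# Illustrative construction of the preserved-region and inpainting masks
# from a human-parsing map. Label IDs are hypothetical, not the paper's.
import torch
import torch.nn.functional as F

FACE, HAIR, ARMS, BOTTOM = 1, 2, 3, 4   # regions assumed to be preserved
                                         # (the torso garment is replaced)

def build_masks(parsing: torch.Tensor, seam_px: int = 5):
    """parsing: (N, 1, H, W) integer label map.
    Returns (preserve, inpaint) binary float masks of the same shape."""
    preserve = torch.zeros_like(parsing, dtype=torch.float32)
    for label in (FACE, HAIR, ARMS, BOTTOM):
        preserve = torch.clamp(preserve + (parsing == label).float(), max=1.0)
    # Everything not preserved may be regenerated; dilating the boundary
    # with max-pooling gives the model room to blend garment seams.
    inpaint = F.max_pool2d(1.0 - preserve, kernel_size=2 * seam_px + 1,
                           stride=1, padding=seam_px)
    return preserve, inpaint
```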
👥 Authors

Minh Tran, University of Arkansas
Johnmark Clements, University of Arkansas
Annie Prasanna, University of Arkansas
Tri Nguyen, Samsung SDS R&D Vietnam (AI, Cloud Computing)
Ngan Le, University of Arkansas (Artificial Intelligence, Machine Learning, Computer Vision)