🤖 AI Summary
Existing virtual try-on methods struggle to preserve garment detail and produce natural-looking results at the same time: explicit warping approaches keep garment details but often yield unrealistic output, while implicit, cross-attention-based approaches reconstruct garments naturally but tend to lose fine details. To address this trade-off, we propose HYB-VITON, a hybrid framework that combines explicit and implicit warping. The method introduces a preprocessing pipeline for explicitly warped garments and a novel training option, which together let the model use the beneficial regions of the warped garment while relying on implicit warping for natural reconstruction. Experiments show that the approach preserves garment details more faithfully than recent diffusion-based methods and produces more realistic results than a state-of-the-art explicit warping method.
📝 Abstract
Virtual try-on systems have significant potential in e-commerce, allowing customers to visualize garments on themselves. Existing image-based methods fall into two categories: those that directly warp garment images onto person images (explicit warping), and those that use cross-attention to reconstruct the given garment (implicit warping). Explicit warping preserves garment details but often produces unrealistic output, while implicit warping achieves natural reconstruction but struggles with fine details. We propose HYB-VITON, a novel approach that combines the advantages of both methods and includes a preprocessing pipeline for warped garments and a novel training option. These components allow us to utilize beneficial regions of explicitly warped garments while leveraging the natural reconstruction of implicit warping. A series of experiments demonstrates that HYB-VITON preserves garment details more faithfully than recent diffusion-based methods, while producing more realistic results than a state-of-the-art explicit warping method.
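
The abstract does not specify how the beneficial regions of a warped garment are selected or combined with the implicit reconstruction. As a purely illustrative sketch of the general idea of mixing two sources by region, and not HYB-VITON's actual pipeline, the following assumes a hypothetical per-pixel mask `keep_mask` that marks where the explicitly warped garment is trusted. The function name `blend_regions`, the mask, and the toy grayscale images are all assumptions introduced for this example.

```python
from typing import List

# Grayscale image as rows of floats in [0, 1] (an assumption for this toy example).
Image = List[List[float]]


def blend_regions(warped: Image, generated: Image, keep_mask: Image) -> Image:
    """Blend two images per pixel.

    keep_mask = 1 takes the explicitly warped garment pixel,
    keep_mask = 0 takes the implicitly generated pixel,
    and values in between mix the two linearly.
    """
    if not (len(warped) == len(generated) == len(keep_mask)):
        raise ValueError("inputs must have the same height")
    out: Image = []
    for w_row, g_row, m_row in zip(warped, generated, keep_mask):
        if not (len(w_row) == len(g_row) == len(m_row)):
            raise ValueError("inputs must have the same width")
        out.append([m * w + (1.0 - m) * g for w, g, m in zip(w_row, g_row, m_row)])
    return out


# Toy 2x2 inputs: a bright "warped garment" and a darker "generated" image.
warped = [[0.9, 0.9], [0.9, 0.9]]
generated = [[0.2, 0.2], [0.2, 0.2]]
# Hypothetical mask: trust the warp fully at top-left, half at bottom-left.
keep_mask = [[1.0, 0.0], [0.5, 0.0]]

result = blend_regions(warped, generated, keep_mask)
```

In this toy setting, pixels with a mask value of 1 come entirely from the warped garment and pixels with a mask value of 0 come entirely from the generated image. A learned model would typically consume the masked warped garment as a conditioning input rather than blend final outputs in this way.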