🤖 AI Summary
Existing virtual try-on methods struggle to model occlusion relationships among multiple clothing layers, often producing unnatural layering effects. This work proposes the first framework specifically designed for multi-layer virtual try-on, explicitly modeling inter-layer occlusions and achieving realistic garment fitting through a novel occlusion learning module and a Stable Diffusion-based deformation and fitting mechanism. The study introduces the first multi-layer garment dataset, MLG, and a new evaluation metric, Layered Appearance Coherence Difference (LACD). Extensive experiments on MLG demonstrate that the proposed method significantly outperforms existing approaches in both visual realism and layering consistency, establishing a new state of the art for multi-layer virtual try-on.
📝 Abstract
Existing image-based virtual try-on (VTON) methods primarily focus on single-layer or multi-garment VTON, neglecting multi-layer VTON (ML-VTON), which dresses multiple layers of garments onto the human body with realistic deformation and layering to generate visually plausible outcomes. The main challenge lies in accurately modeling occlusion relationships between inner and outer garments to reduce interference from redundant inner-garment features. To address this, we propose GO-MLVTON, the first multi-layer VTON method, introducing a Garment Occlusion Learning module to learn occlusion relationships and a Stable Diffusion-based Garment Morphing & Fitting module to deform and fit garments onto the human body, producing high-quality multi-layer try-on results. Additionally, we present the MLG dataset for this task and propose a new metric, Layered Appearance Coherence Difference (LACD), for evaluation. Extensive experiments demonstrate the state-of-the-art performance of GO-MLVTON. Project page: https://upyuyang.github.io/go-mlvton/.