MV-Fashion: Towards Enabling Virtual Try-On and Size Estimation with Multi-View Paired Data

📅 2026-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 4D human datasets commonly lack realistic garment dynamics, fine-grained annotations, and the paired data needed for virtual try-on and size estimation in fashion research. To address this gap, this work introduces MV-Fashion, a large-scale multi-view video dataset of 80 subjects and 3,273 sequences (72.5 million frames in total), capturing synchronized multi-layer clothing configurations alongside corresponding flat-lay product images. The dataset provides pixel-level semantic segmentation, fabric elasticity attributes, and 3D point clouds. MV-Fashion is the first to enable high-fidelity multi-view, multi-layer outfit capture under real-world conditions with precise annotations and aligned flat-to-worn image pairs, thereby supporting tasks such as virtual try-on, size estimation, and novel view synthesis, for which it also establishes baseline benchmarks.

📝 Abstract
Existing 4D human datasets fall short for fashion-specific research, lacking either realistic garment dynamics or task-specific annotations. Synthetic datasets suffer from a realism gap, whereas real-world captures lack the detailed annotations and paired data required for virtual try-on (VTON) and size estimation tasks. To bridge this gap, we introduce MV-Fashion, a large-scale, multi-view video dataset engineered for domain-specific fashion analysis. MV-Fashion features 3,273 sequences (72.5 million frames) from 80 diverse subjects wearing 3-10 outfits each. It is designed to capture complex, real-world garment dynamics, including multiple layers and varied styling (e.g., rolled sleeves, tucked shirts). A core contribution is a rich data representation that includes pixel-level semantic annotations, ground-truth material properties such as elasticity, and 3D point clouds. Crucially for VTON applications, MV-Fashion provides paired data: multi-view synchronized captures of worn garments alongside their corresponding flat, catalogue images. We leverage this dataset to establish baselines for fashion-centric tasks, including virtual try-on, clothing size estimation, and novel view synthesis. The dataset is available at https://hunorlaczko.github.io/MV-Fashion
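The paired worn/flat structure described in the abstract can be sketched as a simple record type. All field names, paths, and the loading convention below are illustrative assumptions based only on the abstract, not the dataset's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record for one MV-Fashion sequence; every field name here
# is an assumption inferred from the abstract, not the released format.
@dataclass
class FashionSequence:
    subject_id: int                 # one of the 80 subjects
    sequence_id: int                # one of the 3,273 sequences
    worn_views: List[str]           # synchronized multi-view frame paths
    flat_images: List[str]          # paired flat-lay catalogue photos
    segmentation: List[str]         # pixel-level semantic masks per view
    elasticity: float               # ground-truth fabric elasticity attribute
    point_cloud: str                # per-frame 3D point cloud path
    layers: List[str] = field(default_factory=list)  # garments, inner to outer

# A VTON training pair couples each worn-garment view with its flat image.
seq = FashionSequence(
    subject_id=7,
    sequence_id=1042,
    worn_views=["cam00/frame_0001.jpg", "cam01/frame_0001.jpg"],
    flat_images=["catalogue/shirt_front.jpg"],
    segmentation=["cam00/seg_0001.png", "cam01/seg_0001.png"],
    elasticity=0.42,
    point_cloud="pointclouds/frame_0001.ply",
    layers=["t-shirt", "jacket"],
)
vton_pairs = [(w, f) for w in seq.worn_views for f in seq.flat_images]
print(len(vton_pairs))  # 2 worn views x 1 flat image = 2 pairs
```

The flat-to-worn pairing is what distinguishes this dataset for VTON: each catalogue image can supervise every synchronized camera view of the same garment being worn.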
Problem

Research questions and friction points this paper is trying to address.

virtual try-on
size estimation
fashion dataset
garment dynamics
paired data
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-view paired data
virtual try-on
garment dynamics
size estimation
fashion dataset
Hunor Laczkó
Universitat Autònoma de Barcelona, Computer Vision Center, Universitat de Barcelona
Libang Jia
Computer Vision Center, Universitat de Barcelona
Loc-Phat Truong
Computer Vision Center, Universitat de Barcelona
Diego Hernández
Computer Vision Center, Universitat de Barcelona
Sergio Escalera
Prof., ICREA Academy, University of Barcelona, Computer Vision Center, ELLIS & IAPR & AAIA Fellow
Human Behavior Analysis · Machine Learning · Computer Vision · Affective Computing · Social Signal Processing
Jordi Gonzalez
Universitat Autònoma de Barcelona, Computer Vision Center
Meysam Madadi
PostDoc Researcher, University of Barcelona
Computer Vision · 3D