FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding

📅 2026-04-10

🤖 AI Summary

Existing fashion datasets are often fragmented and limited to single tasks, which hinders expert-level, holistic understanding of style, occasion, and outfit coordination logic. To address this, the work introduces a multimodal benchmark dataset annotated by fashion experts, with fine-grained semantic labels for both individual garments and complete outfits. It further proposes an expert knowledge-driven unified framework for fashion understanding, described as the first of its kind, covering three core tasks: outfit-to-item grounding, outfit completion, and outfit evaluation. The approach combines expert annotation protocols, multimodal large language model (MLLM) training, and context-aware compatibility modeling. Experiments indicate that training on the dataset improves grounding, completion, and outfit-level semantic evaluation in MLLM-based systems, supporting its use as both a unified benchmark and a training resource.

📝 Abstract

Fashion understanding requires both visual perception and expert-level reasoning about style, occasion, compatibility, and outfit rationale. However, existing fashion datasets remain fragmented and task-specific, often focusing on item attributes, outfit co-occurrence, or weak textual supervision, and thus provide limited support for holistic outfit understanding. In this paper, we introduce FashionStylist, an expert-annotated benchmark for holistic and expert-level fashion understanding. Constructed through a dedicated fashion-expert annotation pipeline, FashionStylist provides professionally grounded annotations at both the item and outfit levels. It supports three representative tasks: outfit-to-item grounding, outfit completion, and outfit evaluation. These tasks cover realistic item recovery from complex outfits with layering and accessories, compatibility-aware composition beyond co-occurrence matching, and expert-level assessment of style, season, occasion, and overall coherence. Experimental results show that FashionStylist serves not only as a unified benchmark for multiple fashion tasks, but also as an effective training resource for improving grounding, completion, and outfit-level semantic evaluation in MLLM-based fashion systems.
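
To make the two annotation levels and the completion task more concrete, the following is a minimal sketch of how one record might be represented and turned into an outfit-completion sample. The abstract does not specify the dataset's schema, so every class name, field name, and example value here (`ItemAnnotation`, `OutfitAnnotation`, `make_completion_sample`, the blazer outfit) is a hypothetical illustration and not the authors' format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ItemAnnotation:
    # Item-level labels; field names are assumptions for illustration.
    item_id: str
    category: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class OutfitAnnotation:
    # Outfit-level labels mirroring the aspects named in the abstract
    # (style, season, occasion, rationale); the schema itself is hypothetical.
    outfit_id: str
    items: List[ItemAnnotation]
    style: str
    season: str
    occasion: str
    rationale: str


def make_completion_sample(
    outfit: OutfitAnnotation, held_out: int
) -> Tuple[List[ItemAnnotation], ItemAnnotation]:
    """Hide one item: the remaining items are the context, the hidden one is the target."""
    if not 0 <= held_out < len(outfit.items):
        raise IndexError("held_out must index an item in the outfit")
    context = outfit.items[:held_out] + outfit.items[held_out + 1:]
    return context, outfit.items[held_out]


# Invented example record, used only to exercise the sketch.
outfit = OutfitAnnotation(
    outfit_id="o1",
    items=[
        ItemAnnotation("i1", "blazer", ["wool", "navy"]),
        ItemAnnotation("i2", "trousers", ["tailored"]),
        ItemAnnotation("i3", "loafers", ["leather"]),
    ],
    style="smart casual",
    season="autumn",
    occasion="office",
    rationale="Structured layers in muted tones suit a business setting.",
)

context, target = make_completion_sample(outfit, held_out=1)
```

In this sketch a model would see `context` plus the outfit-level labels and be asked to propose `target`; grounding and evaluation samples could be derived from the same record in a similar way.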

Problem

Research questions and friction points this paper is trying to address.

fashion understanding

multimodal dataset

expert knowledge

holistic outfit understanding

fashion benchmark

Innovation

Methods, ideas, or system contributions that make the work stand out.

expert-annotated dataset

holistic fashion understanding

outfit grounding