FASH-iCNN: Making Editorial Fashion Identity Inspectable Through Multimodal CNN Probing

📅 2026-04-28

🤖 AI Summary
This study addresses the lack of interpretability in existing fashion AI systems, which implicitly encode cultural logic (brand identity, editorial preferences, historical aesthetics) as opaque signals. The authors propose FASH-iCNN, a multimodal convolutional neural network that treats editorial fashion culture as the core signal rather than noise. Trained on 87,547 *Vogue* runway images from 15 fashion houses spanning 1991-2024, the model jointly models garment regions and uses visual channel ablation (color, texture, luminance) to make its recognition of fashion houses, decades, and color traditions interpretable. A clothing-only model reaches 78.2% top-1 accuracy in house identification across 14 houses, 88.6% top-1 accuracy in decade classification, and 58.3% top-1 accuracy in year prediction across 34 years, with a mean error of 2.2 years. The ablations show that removing texture reduces house-identification accuracy by 37.6 percentage points, compared with 10.6 percentage points for removing color, which points to texture and luminance as the primary carriers of editorial identity.
📝 Abstract
Fashion AI systems routinely encode the aesthetic logic of specific houses, editors, and historical moments without disclosing it. We present FASH-iCNN, a multimodal system trained on 87,547 Vogue runway images across 15 fashion houses spanning 1991-2024 that makes this cultural logic inspectable. Given a photograph of a garment, the system recovers which house produced it, which era it belongs to, and which color tradition it reflects. A clothing-only model identifies the fashion house at 78.2% top-1 across 14 houses, the decade at 88.6% top-1, and the specific year at 58.3% top-1 across 34 years with a mean error of just 2.2 years. Probing which visual channels carry this signal reveals a sharp dissociation: removing color costs only 10.6pp of house identity accuracy, while removing texture costs 37.6pp, establishing texture and luminance as the primary carriers of editorial identity. FASH-iCNN treats editorial culture as the signal rather than background noise, identifying which houses, eras, and color traditions shaped each output so that users can see not just what the system predicts but which houses, editors, and historical moments are encoded in that prediction.
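The channel-probing idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how color or texture are removed, so the two functions below are assumptions chosen for illustration only: `remove_color` replaces RGB with Rec. 601 luminance, and `remove_texture` averages over small blocks to suppress fine spatial detail. The names `remove_color`, `remove_texture`, `implied_accuracy`, and the block size `k` are hypothetical. The final lines work out the accuracies implied by the reported drops, assuming both drops are measured against the 78.2% clothing-only baseline, which the abstract suggests but does not state outright.

```python
import numpy as np

def remove_color(img: np.ndarray) -> np.ndarray:
    """Assumed color ablation: keep only luminance (Rec. 601 weights).

    img is an (H, W, 3) float array; the result has the same shape,
    with the luminance copied into all three channels.
    """
    weights = np.array([0.299, 0.587, 0.114])
    luma = img @ weights                      # shape (H, W)
    return np.repeat(luma[..., None], 3, axis=2)

def remove_texture(img: np.ndarray, k: int = 8) -> np.ndarray:
    """Assumed texture ablation: average each k x k block.

    Fine detail inside a block is flattened while coarse color and
    brightness survive. Edges that do not fill a block are cropped.
    """
    h, w, c = img.shape
    h2, w2 = h - h % k, w - w % k
    blocks = img[:h2, :w2].reshape(h2 // k, k, w2 // k, k, c)
    means = blocks.mean(axis=(1, 3))          # shape (h2//k, w2//k, c)
    return np.repeat(np.repeat(means, k, axis=0), k, axis=1)

def implied_accuracy(baseline_pct: float, drop_pp: float) -> float:
    """Accuracy left after an ablation that costs drop_pp percentage points."""
    return round(baseline_pct - drop_pp, 1)

# Figures reported in the abstract (house identification, top-1).
BASELINE = 78.2
acc_without_color = implied_accuracy(BASELINE, 10.6)    # 67.6
acc_without_texture = implied_accuracy(BASELINE, 37.6)  # 40.6
```

Under that baseline assumption, house identification would stay near 67.6% without color but fall to about 40.6% without texture, which is the dissociation the abstract describes. In an actual probe, the ablated images would be fed to the trained classifier and the accuracies measured rather than derived.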
Problem

Research questions and friction points this paper is trying to address.

Fashion AI
editorial identity
cultural logic
interpretability
multimodal learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal CNN
fashion identity probing
texture dominance
editorial culture as signal
interpretable fashion AI