Cross-Platform E-Commerce Product Categorization and Recategorization: A Multimodal Hierarchical Classification Approach

Date: 2025-08-27

AI Summary
To address product categorization problems caused by platform heterogeneity and inconsistent taxonomies in cross-platform e-commerce, this paper proposes a multimodal hierarchical classification framework. First, it combines textual (RoBERTa), visual (ViT), and joint vision-language (CLIP) representations and compares three fusion strategies (early, late, and attention-based) within a hierarchical architecture that uses dynamic masking to keep predictions consistent with the taxonomy. Second, it introduces a self-supervised recategorization pipeline that combines contrastive learning (SimCLR), UMAP, and cascade clustering to discover fine-grained novel categories. Third, it deploys a two-stage inference architecture that pairs a lightweight RoBERTa stage with a GPU-accelerated multimodal stage. Evaluated on a dataset of 271,700 products from 40 international fashion e-commerce platforms, the best configuration (CLIP embeddings with MLP-based late fusion) achieves a hierarchical F1-score of 98.59%, and the discovered clusters reach purities above 86%. Cross-platform experiments show a trade-off: late fusion is most accurate when training data is diverse, while simpler early fusion generalizes better to unseen platforms. The framework has been deployed in EURWEB's commercial transaction intelligence platform.
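
The two headline metrics can be made concrete with a small sketch. The summary does not state the exact definitions used in the paper, so the code below assumes the common set-based formulation of hierarchical F1 (precision and recall over each label's ancestor set) and the standard majority-label definition of cluster purity. The function names and the toy category paths are illustrative, not taken from the paper.

```python
from collections import Counter

def ancestors(path):
    """All prefixes of a category path, e.g.
    ('Shoes', 'Boots') -> {('Shoes',), ('Shoes', 'Boots')}."""
    return {tuple(path[:i]) for i in range(1, len(path) + 1)}

def hierarchical_f1(true_paths, pred_paths):
    """Micro-averaged hierarchical F1 over ancestor sets (assumed definition)."""
    overlap = n_pred = n_true = 0
    for t, p in zip(true_paths, pred_paths):
        t_set, p_set = ancestors(t), ancestors(p)
        overlap += len(t_set & p_set)   # taxonomy nodes predicted correctly
        n_pred += len(p_set)            # taxonomy nodes predicted in total
        n_true += len(t_set)            # taxonomy nodes in the ground truth
    if overlap == 0:
        return 0.0
    precision = overlap / n_pred
    recall = overlap / n_true
    return 2 * precision * recall / (precision + recall)

def cluster_purity(cluster_ids, labels):
    """Fraction of items whose label is the majority label of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    majority = sum(cnt.most_common(1)[0][1] for cnt in by_cluster.values())
    return majority / len(labels)

# Toy data: the first prediction gets the parent right but the leaf wrong.
true_paths = [("Shoes", "Boots"), ("Tops", "Shirts")]
pred_paths = [("Shoes", "Sneakers"), ("Tops", "Shirts")]
hf1 = hierarchical_f1(true_paths, pred_paths)
purity = cluster_purity([0, 0, 0, 1, 1], ["a", "a", "b", "c", "c"])
```

Under this definition a prediction with the correct parent but the wrong leaf still earns partial credit, which is why hierarchical F1 can sit well above flat leaf-level accuracy.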

๐Ÿ“ Abstract
This study addresses critical industrial challenges in e-commerce product categorization, namely platform heterogeneity and the structural limitations of existing taxonomies, by developing and deploying a multimodal hierarchical classification framework. Using a dataset of 271,700 products from 40 international fashion e-commerce platforms, we integrate textual features (RoBERTa), visual features (ViT), and joint vision-language representations (CLIP). We investigate fusion strategies, including early, late, and attention-based fusion, within a hierarchical architecture enhanced by dynamic masking to ensure taxonomic consistency. Results show that CLIP embeddings combined via an MLP-based late-fusion strategy achieve the highest hierarchical F1 (98.59%), outperforming unimodal baselines. To address shallow or inconsistent categories, we further introduce a self-supervised "product recategorization" pipeline using SimCLR, UMAP, and cascade clustering, which discovered new, fine-grained categories (e.g., subtypes of "Shoes") with cluster purities above 86%. Cross-platform experiments reveal a deployment-relevant trade-off: complex late-fusion methods maximize accuracy with diverse training data, while simpler early-fusion methods generalize more effectively to unseen platforms. Finally, we demonstrate the framework's industrial scalability through deployment in EURWEB's commercial transaction intelligence platform via a two-stage inference pipeline, combining a lightweight RoBERTa stage with a GPU-accelerated multimodal stage to balance cost and accuracy.
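
The abstract does not spell out how dynamic masking is implemented. One common way to enforce taxonomic consistency, sketched here as an assumption rather than as the authors' method, is to set the logits of every child class that does not belong to the predicted parent to negative infinity before the softmax. The `CHILDREN` taxonomy, the function names, and the logit values are all hypothetical.

```python
import math

# Hypothetical two-level taxonomy: parent class id -> ids of its valid children.
CHILDREN = {0: [0, 1], 1: [2, 3, 4]}

def masked_softmax(child_logits, parent_id, children=CHILDREN):
    """Softmax over child logits with children of other parents masked out."""
    allowed = set(children[parent_id])
    masked = [z if i in allowed else -math.inf
              for i, z in enumerate(child_logits)]
    m = max(masked)                              # finite: at least one child is allowed
    exps = [math.exp(z - m) for z in masked]     # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

def predict_path(parent_logits, child_logits):
    """Pick the parent first, then the best child among that parent's children."""
    parent_id = max(range(len(parent_logits)), key=parent_logits.__getitem__)
    probs = masked_softmax(child_logits, parent_id)
    child_id = max(range(len(probs)), key=probs.__getitem__)
    return parent_id, child_id

# The raw child argmax is class 2, which belongs to parent 1; masking forces a
# child of the predicted parent 0 instead.
path = predict_path([2.0, 0.5], [0.1, 0.3, 5.0, 4.0, 3.0])
```

Without the mask, the example would output parent 0 with child 2, a path that does not exist in the taxonomy. With the mask, the child prediction is always a descendant of the predicted parent.
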
Problem

Research questions and friction points this paper is trying to address.

Addressing platform heterogeneity in e-commerce product categorization
Overcoming structural limitations of existing product taxonomies
Developing scalable multimodal classification for cross-platform deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal hierarchical classification with RoBERTa, ViT, CLIP
Self-supervised recategorization using SimCLR and clustering
Two-stage inference pipeline balancing cost and accuracy
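
The two-stage inference idea can be sketched as a simple router. The source states only that a lightweight RoBERTa stage is combined with a GPU-accelerated multimodal stage. The confidence-threshold routing rule, the threshold value, and the stub models below are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Prediction = Tuple[str, float]  # (category label, confidence in [0, 1])

def two_stage_predict(
    product: Dict[str, str],
    text_stage: Callable[[Dict[str, str]], Prediction],
    multimodal_stage: Callable[[Dict[str, str]], Prediction],
    threshold: float = 0.9,  # assumed routing threshold, not from the paper
) -> Tuple[str, str]:
    """Return (label, stage_used); escalate only low-confidence products."""
    label, confidence = text_stage(product)
    if confidence >= threshold:
        return label, "text"             # cheap path: the text model is confident
    label, _ = multimodal_stage(product)
    return label, "multimodal"           # expensive path: text plus image

# Stub models standing in for the real RoBERTa and multimodal classifiers.
def stub_text_stage(product):
    return ("Shoes", 0.97) if "boot" in product["title"].lower() else ("Tops", 0.40)

def stub_multimodal_stage(product):
    return ("Dresses", 0.88)

easy = two_stage_predict({"title": "Leather ankle boot"},
                         stub_text_stage, stub_multimodal_stage)
hard = two_stage_predict({"title": "Summer piece, one size"},
                         stub_text_stage, stub_multimodal_stage)
```

With a rule of this kind, the share of products that reach the GPU stage, and therefore the serving cost, is controlled by a single threshold that can be tuned against held-out accuracy.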