A Semi-supervised Scalable Unified Framework for E-commerce Query Classification

📅 2025-06-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
E-commerce query classification faces three key challenges: (1) sparse prior information, because queries are short and lack context and the relations among labels go unused; (2) overreliance on users' posterior click feedback to build training samples, which reinforces the Matthew effect; and (3) the absence of a unified framework across subtasks such as intent and category prediction, which makes algorithm optimization inefficient. To address these, the authors propose SSUF, a Semi-supervised Scalable Unified Framework built from three enhancement modules: knowledge enhancement (using world knowledge to enrich query representations), label enhancement (using label semantics and semi-supervised signals to reduce dependence on posterior labels), and structure enhancement (enriching label representations based on the relations among labels). Each module is pluggable, so input features can be added or removed per subtask. Extensive offline experiments and online A/B tests show that SSUF significantly outperforms state-of-the-art models.

📝 Abstract
Query classification, which comprises multiple subtasks such as intent and category prediction, is vital to e-commerce applications. E-commerce queries are usually short and lack context, and the relations among labels are left unused, so models have insufficient prior information to work with. Most existing industrial query classification methods rely on users' posterior click behavior to construct training samples, which creates a vicious cycle driven by the Matthew effect. Furthermore, the subtasks of query classification lack a unified framework, which makes algorithm optimization inefficient. In this paper, we propose a novel Semi-supervised Scalable Unified Framework (SSUF), containing multiple enhanced modules to unify the query classification tasks. The knowledge-enhanced module uses world knowledge to enhance query representations and address the problem of insufficient query information. The label-enhanced module uses label semantics and semi-supervised signals to reduce the dependence on posterior labels. The structure-enhanced module enhances the label representation based on the complex label relations. Each module is highly pluggable, and input features can be added or removed as needed for each subtask. We conduct extensive offline and online A/B experiments, and the results show that SSUF significantly outperforms the state-of-the-art models.
Problem

Research questions and friction points this paper is trying to address.

Classify e-commerce queries with insufficient prior information
Reduce reliance on user click behavior for training
Unify multiple subtasks into a single efficient framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

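Knowledge-enhanced module enriches query representations with world knowledge
Label-enhanced module uses label semantics and semi-supervised signals to reduce dependence on posterior labels
Structure-enhanced module enriches label representations using the relations among labels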
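
The abstract states that each module is pluggable and that input features can be added or removed per subtask, but this page does not describe how that is implemented. The following is a minimal sketch of the general idea only, not the authors' code: `SubtaskPipeline`, `base_features`, `knowledge_features`, and the `TOY_KNOWLEDGE` lookup are all hypothetical names and stand-ins introduced for illustration, and simple hand-written features replace the learned representations a real system would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A feature module maps a raw query string to a list of numeric features.
FeatureModule = Callable[[str], List[float]]

# Hypothetical toy stand-in for "world knowledge": token -> entity type.
TOY_KNOWLEDGE: Dict[str, str] = {
    "nike": "brand",
    "iphone": "brand",
    "shoes": "product",
}


def base_features(query: str) -> List[float]:
    """Stand-in for a query encoder: token count and character count."""
    return [float(len(query.split())), float(len(query))]


def knowledge_features(query: str) -> List[float]:
    """Stand-in for knowledge enhancement: counts of known brands and products."""
    tokens = query.lower().split()
    brands = sum(1 for t in tokens if TOY_KNOWLEDGE.get(t) == "brand")
    products = sum(1 for t in tokens if TOY_KNOWLEDGE.get(t) == "product")
    return [float(brands), float(products)]


@dataclass
class SubtaskPipeline:
    """Holds the feature modules plugged in for one subtask (hypothetical)."""

    name: str
    modules: Dict[str, FeatureModule] = field(default_factory=dict)

    def plug(self, key: str, module: FeatureModule) -> "SubtaskPipeline":
        """Add (or replace) a module under the given key."""
        self.modules[key] = module
        return self

    def unplug(self, key: str) -> "SubtaskPipeline":
        """Remove a module if present; do nothing otherwise."""
        self.modules.pop(key, None)
        return self

    def encode(self, query: str) -> List[float]:
        """Concatenate the outputs of all plugged modules, in insertion order."""
        features: List[float] = []
        for module in self.modules.values():
            features.extend(module(query))
        return features


# Two subtasks share the same building blocks but use different subsets.
intent = SubtaskPipeline("intent").plug("base", base_features)
category = (
    SubtaskPipeline("category")
    .plug("base", base_features)
    .plug("knowledge", knowledge_features)
)

print(intent.encode("nike running shoes"))
print(category.encode("nike running shoes"))
```

In this sketch the intent subtask uses only the base features, while the category subtask also plugs in the knowledge features; calling `category.unplug("knowledge")` would revert it to the base configuration without touching the other subtask.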
Authors

Chunyuan Yuan (Ph.D., UCAS China; NLP & IR)
Chong Zhang (JD.COM)
Zheng Fang (JD.COM)
Ming Pang (JD.COM)
Xue Jiang (JD.COM)
Changping Peng (JD.COM)
Zhangang Lin (JD.COM)
Ching Law (MIT)