Thinking Broad, Acting Fast: Latent Reasoning Distillation from Multi-Perspective Chain-of-Thought for E-Commerce Relevance

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing e-commerce search relevance models, which struggle with long-tail and ambiguous queries and fail to comprehensively capture multi-dimensional relevance—such as user intent, attribute matching, and business rules—due to reliance on single-perspective chain-of-thought reasoning. Moreover, large models suffer from high inference latency, and conventional knowledge distillation discards reasoning structure during deployment, compromising both interpretability and reasoning capability. To overcome these challenges, we propose a Multi-Perspective Chain-of-Thought (MPCoT) framework that integrates supervised fine-tuning and direct preference optimization to construct a strong teacher model, along with a Latent Reasoning Knowledge Distillation (LRKD) mechanism that enables a lightweight student model to retain structured reasoning ability under low-latency constraints. Evaluated on a production advertising platform with tens of millions of daily active users, our approach significantly improves offline metrics and demonstrates dual gains in business performance and user experience through A/B testing.

📝 Abstract
Effective relevance modeling is crucial for e-commerce search, as it aligns search results with user intent and enhances customer experience. Recent work has leveraged large language models (LLMs) to address the limitations of traditional relevance models, especially for long-tail and ambiguous queries. By incorporating Chain-of-Thought (CoT) reasoning, these approaches improve both accuracy and interpretability through multi-step reasoning. However, two key limitations remain: (1) most existing approaches rely on single-perspective CoT reasoning, which fails to capture the multifaceted nature of e-commerce relevance (e.g., user intent vs. attribute-level matching vs. business-specific rules); and (2) although CoT-enhanced LLMs offer rich reasoning capabilities, their high inference latency necessitates knowledge distillation for real-time deployment, yet current distillation methods discard the CoT rationale structure at inference, using it as a transient auxiliary signal and forfeiting its reasoning utility. To address these challenges, we propose a novel framework that better exploits CoT semantics throughout the optimization pipeline. Specifically, the teacher model leverages Multi-Perspective CoT (MPCoT) to generate diverse rationales and combines Supervised Fine-Tuning (SFT) with Direct Preference Optimization (DPO) to construct a more robust reasoner. For distillation, we introduce Latent Reasoning Knowledge Distillation (LRKD), which endows a student model with a lightweight inference-time latent reasoning extractor, allowing efficient and low-latency internalization of the LLM's sophisticated reasoning capabilities. Evaluated in offline experiments and online A/B tests on an e-commerce search advertising platform serving tens of millions of users daily, our method delivers significant offline gains, showing clear benefits in both commercial performance and user experience.
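The abstract's LRKD idea (a lightweight student that keeps an inference-time latent reasoning head aligned to the teacher's CoT) can be sketched in a few lines. This is a minimal illustration, not the paper's actual architecture: the layer sizes, the MSE alignment term, the loss weights `alpha`/`beta`, and the names `LatentReasoningStudent` and `lrkd_loss` are all assumptions for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentReasoningStudent(nn.Module):
    """Hypothetical student: a light encoder plus a latent reasoning
    extractor that runs at inference time (no generated CoT text)."""
    def __init__(self, in_dim=300, hidden=128, latent=32, n_classes=2):
        super().__init__()
        self.encoder = nn.Linear(in_dim, hidden)           # stand-in for a small text encoder
        self.latent_extractor = nn.Linear(hidden, latent)  # latent "reasoning" head
        self.classifier = nn.Linear(hidden + latent, n_classes)

    def forward(self, x):
        h = torch.relu(self.encoder(x))
        z = self.latent_extractor(h)                       # latent reasoning vector
        logits = self.classifier(torch.cat([h, z], dim=-1))
        return logits, z

def lrkd_loss(logits, z, labels, teacher_logits, teacher_cot_emb,
              alpha=0.5, beta=0.5, T=2.0):
    """Assumed objective: task cross-entropy + soft-label KL to the teacher
    + alignment of the student's latent vector with an embedding of the
    teacher's CoT rationale (weights alpha, beta are illustrative)."""
    ce = F.cross_entropy(logits, labels)
    kd = F.kl_div(F.log_softmax(logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    align = F.mse_loss(z, teacher_cot_emb)
    return ce + alpha * kd + beta * align
```

At serving time only the student's forward pass runs, so the reasoning signal survives distillation without the latency of generating CoT tokens.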
Problem

Research questions and friction points this paper is trying to address.

e-commerce relevance
Chain-of-Thought
multi-perspective reasoning
knowledge distillation
inference latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Perspective Chain-of-Thought
Latent Reasoning Knowledge Distillation
LLM Distillation
E-Commerce Relevance
Direct Preference Optimization
Baopu Qiu
Alibaba International Digital Commerce Group
Hao Chen
Alibaba International Digital Commerce Group
Yuanrong Wu
Zhejiang University
Changtong Zan
Alibaba International Digital Commerce Group
Chao Wei
Weiru Zhang
Alibaba International Digital Commerce Group
Xiaoyi Zeng
Alibaba International Digital Commerce Group