Question-to-Knowledge: Multi-Agent Generation of Inspectable Facts for Product Mapping

📅 2025-09-01
🤖 AI Summary
Cross-platform SKU matching in e-commerce is difficult when identifiers are missing, product names vary widely, and fine-grained attributes (e.g., brand, specifications, bundle configurations) are overlooked, so conventional rule- and keyword-based methods misclassify products. This paper proposes Question-to-Knowledge (Q2K), a multi-agent large language model framework: a Reasoning Agent generates targeted disambiguation questions; a Knowledge Agent resolves them through focused web searches; and a Deduplication Agent reuses validated reasoning traces to avoid repeated searches. A human-in-the-loop mechanism further refines uncertain cases. Evaluated on real-world consumer goods datasets, Q2K outperforms strong baselines in accuracy and robustness, particularly in difficult scenarios such as bundle identification and brand origin disambiguation, while keeping its decisions interpretable.

📝 Abstract
Identifying whether two product listings refer to the same Stock Keeping Unit (SKU) is a persistent challenge in e-commerce, especially when explicit identifiers are missing and product names vary widely across platforms. Rule-based heuristics and keyword similarity often misclassify products by overlooking subtle distinctions in brand, specification, or bundle configuration. To overcome these limitations, we propose Question-to-Knowledge (Q2K), a multi-agent framework that leverages Large Language Models (LLMs) for reliable SKU mapping. Q2K integrates: (1) a Reasoning Agent that generates targeted disambiguation questions, (2) a Knowledge Agent that resolves them via focused web searches, and (3) a Deduplication Agent that reuses validated reasoning traces to reduce redundancy and ensure consistency. A human-in-the-loop mechanism further refines uncertain cases. Experiments on real-world consumer goods datasets show that Q2K surpasses strong baselines, achieving higher accuracy and robustness in difficult scenarios such as bundle identification and brand origin disambiguation. By reusing retrieved reasoning instead of issuing repeated searches, Q2K balances accuracy with efficiency, offering a scalable and interpretable solution for product integration.
Problem

Research questions and friction points this paper is trying to address.

Identifying product listings that refer to the same SKU when explicit identifiers are missing
Overcoming the limitations of rule-based heuristics and keyword similarity, which overlook brand, specification, and bundle differences
Resolving ambiguous product name variations across e-commerce platforms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework (Q2K) that uses LLMs for SKU mapping
Generates targeted disambiguation questions and resolves them via focused web searches
Reuses validated reasoning traces to reduce redundancy and ensure consistency
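
The three-agent flow listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`TraceCache`, `generate_questions`, `resolve`, `decide`, `fake_search`) is hypothetical, the LLM-based Reasoning Agent is replaced by a fixed question template, the web search is replaced by a stub, and the final decision rule is an assumed placeholder. The only point it is meant to show is how reusing already-resolved questions avoids repeated searches.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class TraceCache:
    """Stand-in for the Deduplication Agent: reuses answers already resolved."""
    store: Dict[str, str] = field(default_factory=dict)
    hits: int = 0

    @staticmethod
    def _key(question: str) -> str:
        # Assumed normalization: case- and whitespace-insensitive exact match.
        return " ".join(question.lower().split())

    def lookup(self, question: str) -> Optional[str]:
        answer = self.store.get(self._key(question))
        if answer is not None:
            self.hits += 1
        return answer

    def save(self, question: str, answer: str) -> None:
        self.store[self._key(question)] = answer


def generate_questions(listing_a: str, listing_b: str) -> List[str]:
    """Stand-in for the Reasoning Agent; a fixed template instead of an LLM."""
    return [
        f"Is the brand of '{listing_a}' the same as the brand of '{listing_b}'?",
        f"Do '{listing_a}' and '{listing_b}' have the same pack size or bundle?",
    ]


def resolve(questions: List[str],
            search: Callable[[str], str],
            cache: TraceCache) -> Dict[str, str]:
    """Stand-in for the Knowledge Agent: searches only on a cache miss."""
    facts: Dict[str, str] = {}
    for question in questions:
        answer = cache.lookup(question)
        if answer is None:
            answer = search(question)
            cache.save(question, answer)
        facts[question] = answer
    return facts


def decide(facts: Dict[str, str]) -> str:
    """Assumed rule: all 'yes' is a match, any 'no' is a mismatch,
    anything else is deferred to a human reviewer."""
    values = set(facts.values())
    if values == {"yes"}:
        return "match"
    if "no" in values:
        return "no_match"
    return "needs_review"


# Demo with a stub search that records how often it is called.
calls: List[str] = []

def fake_search(question: str) -> str:
    calls.append(question)
    return "yes"

cache = TraceCache()
a, b = "Acme Cola 500ml x6", "ACME cola 6-pack 500 ml"
first = decide(resolve(generate_questions(a, b), fake_search, cache))
second = decide(resolve(generate_questions(a, b), fake_search, cache))
print(first, second, len(calls), cache.hits)
```

In this sketch the second pass over the same pair issues no new searches, because both questions are served from the cache. A real system would presumably need a looser notion of "same question" than exact normalized text, which the sketch does not attempt.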
Wonduk Seo
PKU Alumni; Enhans
Machine Learning, Text Mining, Information Retrieval, Social Computing, Bioinformatics
Taesub Shin
AI Research, Enhans, Seoul, South Korea
Hyunjin An
AI Research, Enhans, Seoul, South Korea
Dokyun Kim
AI Research, Enhans, Seoul, South Korea
Seunghyun Lee
AI Research, Enhans, Seoul, South Korea