🤖 AI Summary
Cross-platform SKU matching in e-commerce is hampered by missing identifiers, wide variation in product naming, and methods that overlook fine-grained attributes (e.g., brand, specifications, bundle configurations), making conventional rule- and keyword-based approaches unreliable. This paper proposes Question to Knowledge (Q2K), a multi-agent framework built on large language models: a Reasoning Agent generates targeted disambiguation questions; a Knowledge Agent resolves them via focused web searches; and a Deduplication Agent reuses validated reasoning traces to reduce redundancy and keep decisions consistent. A human-in-the-loop mechanism further refines uncertain cases, improving both matching accuracy and interpretability. Evaluated on real-world consumer goods datasets, Q2K outperforms strong baselines, excelling in difficult scenarios such as brand origin disambiguation and bundle identification, and by reusing retrieved reasoning instead of issuing repeated searches it balances accuracy with efficiency.
📝 Abstract
Identifying whether two product listings refer to the same Stock Keeping Unit (SKU) is a persistent challenge in e-commerce, especially when explicit identifiers are missing and product names vary widely across platforms. Rule-based heuristics and keyword similarity often misclassify products by overlooking subtle distinctions in brand, specification, or bundle configuration. To overcome these limitations, we propose Question to Knowledge (Q2K), a multi-agent framework that leverages Large Language Models (LLMs) for reliable SKU mapping. Q2K integrates: (1) a Reasoning Agent that generates targeted disambiguation questions, (2) a Knowledge Agent that resolves them via focused web searches, and (3) a Deduplication Agent that reuses validated reasoning traces to reduce redundancy and ensure consistency. A human-in-the-loop mechanism further refines uncertain cases. Experiments on real-world consumer goods datasets show that Q2K surpasses strong baselines, achieving higher accuracy and robustness in difficult scenarios such as bundle identification and brand origin disambiguation. By reusing retrieved reasoning instead of issuing repeated searches, Q2K balances accuracy with efficiency, offering a scalable and interpretable solution for product integration.
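The three-agent pipeline in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `ReasoningAgent`, `KnowledgeAgent`, and `DeduplicationAgent` classes, their methods, and the stubbed answer lookup are all hypothetical stand-ins (a real Reasoning Agent would be an LLM call, and a real Knowledge Agent would issue focused web searches). The sketch only shows the data flow Q2K describes: questions are generated per listing pair, answered once, cached by the deduplication layer, and a pair maps to one SKU only if every disambiguation check passes.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """A product listing from one platform (title only, for illustration)."""
    title: str


class ReasoningAgent:
    """Generates targeted disambiguation questions for a listing pair.
    Stand-in for an LLM; the question templates here are assumptions."""
    def ask(self, a: Listing, b: Listing) -> list[str]:
        return [
            f"Do '{a.title}' and '{b.title}' share the same brand?",
            f"Do '{a.title}' and '{b.title}' have identical specifications?",
            f"Is exactly one of '{a.title}' / '{b.title}' a multi-item bundle?",
        ]


class KnowledgeAgent:
    """Resolves questions; a real system would run focused web searches.
    Here answers come from a pre-filled lookup table (an assumption)."""
    def __init__(self, knowledge: dict[str, bool]):
        self.knowledge = knowledge

    def answer(self, question: str) -> bool:
        # Unknown questions default to False (no evidence of a match).
        return self.knowledge.get(question, False)


class DeduplicationAgent:
    """Caches validated question -> answer traces so repeated questions
    are reused instead of triggering another retrieval."""
    def __init__(self):
        self.cache: dict[str, bool] = {}

    def resolve(self, question: str, knowledge: KnowledgeAgent) -> bool:
        if question not in self.cache:
            self.cache[question] = knowledge.answer(question)
        return self.cache[question]


def match_sku(a: Listing, b: Listing, reasoner: ReasoningAgent,
              knowledge: KnowledgeAgent, dedup: DeduplicationAgent) -> bool:
    """Two listings map to the same SKU only if every check passes.
    The last question is phrased so True means a bundle mismatch."""
    questions = reasoner.ask(a, b)
    brand_ok = dedup.resolve(questions[0], knowledge)
    spec_ok = dedup.resolve(questions[1], knowledge)
    bundle_mismatch = dedup.resolve(questions[2], knowledge)
    return brand_ok and spec_ok and not bundle_mismatch
```

In this toy version the Deduplication Agent is just a per-question cache; calling `match_sku` again on the same pair (or on another pair that raises an identical question) reads the cached answer rather than re-querying the Knowledge Agent, which is the accuracy/efficiency trade-off the abstract attributes to reusing retrieved reasoning.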