AI Summary
This work addresses the challenge of fine-grained similarity assessment in e-commerce entity search, where relevance depends on both product categories and contextual cues. Conventional embedding methods handle this task poorly because they cannot model attribute correlations. The authors propose a two-stage zero-shot ranking approach: offline, a large language model (LLM) constructs a category-aware, structured attribute graph; online, a graph-augmented LLM efficiently ranks candidate entities by reasoning over this reusable graph. This is the first method to integrate a precomputed attribute graph with an LLM for zero-shot ranking without any training data. Experiments demonstrate that the approach outperforms baselines by over 5% in mean average precision, reduces per-item inference tokens by 57%, and exhibits strong cross-category generalization and practical deployment viability.
Abstract
Entity search, i.e., finding the most similar entities to a query entity, faces unique challenges in e-commerce, where product similarity varies across categories and contexts. Traditional embedding-based approaches often struggle to capture nuanced context-specific attribute relevance. In this paper, we present a two-stage approach combining Large Language Model (LLM)-driven attribute graph construction with graph-aware LLM ranking. In the offline stage, we extract structured product attributes from unstructured text and construct a reusable attribute graph with category-aware schemas. In the online stage, we rank retrieved candidates by reasoning over this structured representation rather than raw text, reducing per-product token usage by 57% while improving ranking precision. Experiments show that our approach outperforms multiple baselines under zero-shot scenarios, achieving an improvement of over 5% in average precision without requiring training data, generalizes robustly across diverse product categories, and shows strong potential for real-world deployment.
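To make the two-stage idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the offline LLM extraction has already produced attribute dictionaries, and it shows only how a category-aware schema could filter and order those attributes so that the online ranking prompt carries a compact structured line per product instead of raw text. The names `AttributeGraph`, `add_product`, `serialize`, `build_ranking_prompt`, and the example schema are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AttributeGraph:
    """Toy attribute graph (hypothetical): categories own schemas,
    products link to the attribute values they carry."""
    # category -> ordered attribute names considered relevant for that category
    schemas: dict = field(default_factory=dict)
    # product id -> (category, {attribute name: value})
    products: dict = field(default_factory=dict)
    # (attribute name, value) -> ids of products sharing that value
    value_index: dict = field(default_factory=lambda: defaultdict(set))

    def add_product(self, product_id, category, attributes):
        """Offline step: keep only attributes listed in the category schema."""
        schema = self.schemas[category]
        kept = {k: v for k, v in attributes.items() if k in schema}
        self.products[product_id] = (category, kept)
        for name, value in kept.items():
            self.value_index[(name, value)].add(product_id)

    def serialize(self, product_id):
        """Online step: compact, schema-ordered text for one product."""
        category, attrs = self.products[product_id]
        ordered = [f"{n}={attrs[n]}" for n in self.schemas[category] if n in attrs]
        return f"[{category}] " + "; ".join(ordered)


def build_ranking_prompt(graph, query_id, candidate_ids):
    """Assemble a ranking prompt from structured lines rather than raw descriptions."""
    lines = [
        "Rank the candidates by similarity to the query product.",
        "Query: " + graph.serialize(query_id),
    ]
    for i, cid in enumerate(candidate_ids, 1):
        lines.append(f"{i}. " + graph.serialize(cid))
    return "\n".join(lines)
```

In this sketch the graph is built once and reused across queries, which mirrors the offline/online split described above; the actual prompt format, schema construction, and LLM calls used in the paper are not specified here.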