🤖 AI Summary
To address limitations of extreme multi-label classification (XMC) for e-commerce keyphrase recommendation, including inadequate relevance scoring, insufficient modeling of buyer intent, and poor scalability at inference time, this paper proposes GraphEx, a graph-based keyphrase recommendation method tailored to advertising. Instead of the conventional item-to-keyphrase tagging paradigm, GraphEx recommends keyphrases by extracting token permutations from product titles. Because precision and recall alone can be misleading in practice, the authors evaluate with a combination of metrics that capture both the relevance of a keyphrase to an item and its potential for buyer outreach. Evaluated at eBay, GraphEx outperforms the production models, supports near real-time inference in resource-constrained environments, and scales to billions of items.
📝 Abstract
Online sellers and advertisers are recommended keyphrases for their listed products, which they bid on to increase sales. One popular paradigm for generating such recommendations is Extreme Multi-Label Classification (XMC), which tags or maps keyphrases to items. We outline the limitations of traditional item-query-based tagging or mapping techniques for keyphrase recommendation on e-commerce platforms. We introduce GraphEx, a graph-based approach that recommends keyphrases to sellers by extracting token permutations from item titles. We also show that relying on traditional metrics such as precision and recall can be misleading in practice, so a combination of metrics is needed to evaluate performance in real-world scenarios. These metrics assess the relevance of keyphrases to items and the potential for buyer outreach. GraphEx outperforms production models at eBay on both relevance and buyer outreach. It supports near real-time inference in resource-constrained production environments and scales effectively to billions of items.
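The idea of extracting keyphrases as token permutations of an item title can be illustrated with a minimal sketch. This is not the GraphEx implementation: the function names (`build_token_index`, `recommend`), the token-to-keyphrase index, the toy search counts, and the ranking by search count as a stand-in for buyer outreach are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_token_index(keyphrases):
    """Map each token to the keyphrases containing it.

    Hypothetical structure: a simple token-to-keyphrase bipartite index.
    """
    index = defaultdict(set)
    for phrase in keyphrases:
        for token in phrase.split():
            index[token].add(phrase)
    return index


def recommend(title, index, search_counts, top_k=3):
    """Return keyphrases whose tokens all occur in the title.

    Word order is ignored, so any permutation of a subset of title
    tokens qualifies. Ranking by search count is an assumed proxy
    for buyer outreach, not a metric taken from the paper.
    """
    title_tokens = set(title.lower().split())

    # Gather every keyphrase that shares at least one token with the title.
    candidates = set()
    for token in title_tokens:
        candidates |= index.get(token, set())

    # Keep only keyphrases fully covered by the title tokens.
    extracted = [p for p in candidates if set(p.split()) <= title_tokens]

    # Highest search count first; ties broken alphabetically.
    extracted.sort(key=lambda p: (-search_counts.get(p, 0), p))
    return extracted[:top_k]


# Toy data (made up for this example).
search_counts = {
    "iphone case": 900,
    "case iphone 13": 120,
    "leather case": 300,
    "samsung case": 500,
}
index = build_token_index(search_counts)
recs = recommend("Leather Case for iPhone 13", index, search_counts)
print(recs)
```

In this toy setup, "samsung case" is excluded even though it shares the token "case" with the title, because "samsung" does not appear in the title. Restricting recommendations to title tokens is one simple way to keep them relevant to the item.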