🤖 AI Summary
Existing vector quantization–based item indexing methods struggle to handle the highly skewed and non-stationary item distributions prevalent in streaming recommendation systems, resulting in low assignment accuracy, imbalanced clustering, and insufficient inter-cluster separation. To address these challenges, this work proposes MERGE, an adaptive hierarchical item indexing paradigm that dynamically constructs clusters from scratch, continuously monitors cluster occupancy in real time, and employs a fine-to-coarse merging strategy to build a hierarchical structure. MERGE substantially improves assignment accuracy, cluster uniformity, and inter-cluster separation. Online A/B tests further demonstrate its significant gains on key business metrics, validating its practical effectiveness in real-world streaming recommendation scenarios.
📝 Abstract
Item indexing, which maps a large corpus of items into compact discrete representations, is critical for both discriminative and generative recommender systems. However, existing Vector Quantization (VQ)-based approaches struggle with the highly skewed and non-stationary item distributions common in industrial streaming recommenders, leading to poor assignment accuracy, imbalanced cluster occupancy, and insufficient cluster separation. To address these challenges, we propose MERGE, a next-generation item indexing paradigm that adaptively constructs clusters from scratch, dynamically monitors cluster occupancy, and forms hierarchical index structures via fine-to-coarse merging. Extensive experiments demonstrate that MERGE significantly improves assignment accuracy, cluster uniformity, and cluster separation compared with existing indexing methods. Online A/B tests also show substantial gains in key business metrics, highlighting its potential as a foundational indexing approach for large-scale recommendation.
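The abstract names three ingredients (clusters built from scratch, occupancy monitoring, fine-to-coarse merging) without giving details. The Python sketch below is only an illustration of how such ingredients could fit together, not the authors' implementation. Everything specific in it is an assumption: the cosine-similarity `threshold`, the running-mean centroid update, the greedy pairwise merge rule, and the function names `assign_or_create` and `merge_fine_to_coarse`.

```python
import numpy as np

def assign_or_create(items, threshold):
    """Hypothetical online step: attach each item to its most similar
    centroid if cosine similarity >= threshold, else open a new cluster."""
    centroids, counts, labels = [], [], []
    for x in items:
        x = x / np.linalg.norm(x)
        best = -1
        if centroids:
            C = np.stack(centroids)
            sims = C @ x / np.linalg.norm(C, axis=1)
            best = int(np.argmax(sims))
        if best >= 0 and sims[best] >= threshold:
            counts[best] += 1  # occupancy is tracked per cluster
            centroids[best] += (x - centroids[best]) / counts[best]  # running mean
            labels.append(best)
        else:
            centroids.append(x.copy())
            counts.append(1)
            labels.append(len(centroids) - 1)
    return np.stack(centroids), np.array(counts), np.array(labels)

def merge_fine_to_coarse(centroids, counts, n_coarse):
    """Hypothetical merge step: repeatedly fuse the two most similar
    clusters until n_coarse remain. Returns the coarse id of each fine cluster."""
    groups = [[i] for i in range(len(centroids))]
    cents = [c.copy() for c in centroids]
    cnts = [int(c) for c in counts]
    while len(groups) > n_coarse:
        C = np.stack(cents)
        C = C / np.linalg.norm(C, axis=1, keepdims=True)
        S = C @ C.T
        np.fill_diagonal(S, -np.inf)  # ignore self-similarity
        i, j = sorted(int(k) for k in np.unravel_index(np.argmax(S), S.shape))
        total = cnts[i] + cnts[j]
        cents[i] = (cnts[i] * cents[i] + cnts[j] * cents[j]) / total  # occupancy-weighted mean
        cnts[i] = total
        groups[i] += groups[j]
        del groups[j], cents[j], cnts[j]
    parent = np.empty(len(centroids), dtype=int)
    for coarse_id, members in enumerate(groups):
        parent[members] = coarse_id
    return parent

# Toy skewed stream: three well-separated directions with 50, 5, and 2 items.
rng = np.random.default_rng(0)
base = np.eye(4)[:3]
items = np.concatenate(
    [b + 0.01 * rng.normal(size=(n, 4)) for b, n in zip(base, (50, 5, 2))]
)
centroids, counts, labels = assign_or_create(items, threshold=0.9)
parent = merge_fine_to_coarse(centroids, counts, n_coarse=2)
```

In this toy setup, `counts` exposes the occupancy skew that a monitoring component would watch, and `parent` gives a two-level index: the fine cluster id from `labels` and the coarse id from `parent[labels]`. A production system would need additional logic (for example, handling drift or splitting overfull clusters) that this sketch does not attempt.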