🤖 AI Summary
This work addresses the challenge of relevance modeling over massive, dynamically evolving real-world query streams, where informative training samples are sparse and pseudo-labels produced by the current model can be unreliable. To overcome these limitations, the authors propose a self-evolving relevance modeling framework (SERM) built on two multi-agent modules: a multi-agent sample miner that detects distribution shifts and selects informative instances, and a multi-agent relevance annotator that produces reliable pseudo-labels through a two-level agreement framework. By combining distribution shift awareness, collaborative pseudo-label refinement, and iterative self-evolution, the method achieves significant performance gains. Offline multilingual evaluations and online testing in an industrial system serving billions of daily requests support its effectiveness at scale.
📝 Abstract
Because real-world query streams evolve dynamically, relevance models struggle to generalize to practical search scenarios. One sophisticated solution is to apply self-evolution techniques. However, in large-scale industrial settings with massive query streams, these techniques face two challenges: (1) informative samples are often sparse and difficult to identify, and (2) pseudo-labels generated by the current model can be unreliable. To address these challenges, we propose a Self-Evolving Relevance Model approach (SERM), which comprises two complementary multi-agent modules: a multi-agent sample miner, designed to detect distributional shifts and identify informative training samples, and a multi-agent relevance annotator, which provides reliable labels through a two-level agreement framework. We evaluate SERM in a large-scale industrial system that serves billions of user requests daily. Experimental results demonstrate that SERM achieves significant performance gains through iterative self-evolution, as validated by extensive offline multilingual evaluations and online testing.
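The abstract does not specify what the two agreement levels are. As a rough illustration only, the sketch below assumes one possible reading: annotator agents are organized into groups, level one requires agreement within each group, and level two requires agreement across the group-level votes. The names `majority` and `two_level_label`, the group structure, and the share thresholds are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence

Label = Hashable
# Hypothetical agent interface: (query, document) -> relevance label.
Agent = Callable[[str, str], Label]


def majority(labels: Sequence[Label], min_share: float) -> Optional[Label]:
    """Return the most common label if its share reaches min_share, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_share else None


def two_level_label(
    query: str,
    doc: str,
    groups: Sequence[Sequence[Agent]],
    within_share: float = 1.0,
    across_share: float = 1.0,
) -> Optional[Label]:
    """Return a pseudo-label only if both assumed agreement levels pass."""
    group_votes = []
    for group in groups:
        # Level 1 (assumed): agents inside one group must agree.
        vote = majority([agent(query, doc) for agent in group], within_share)
        if vote is None:
            return None  # discard the sample instead of keeping a noisy label
        group_votes.append(vote)
    # Level 2 (assumed): the group-level votes must agree with each other.
    return majority(group_votes, across_share)
```

Under this reading, a query-document pair that fails either level is dropped from the training pool rather than labeled, which trades coverage for label reliability.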