FlashEvaluator: Expanding Search Space with Parallel Evaluation

📅 2026-03-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Conventional generator-evaluator frameworks score each candidate sequence independently, so the evaluator lacks explicit cross-sequence comparison and parallelizes poorly, which constrains both accuracy and system throughput. The paper proposes FlashEvaluator, presented as the first method to share token-level information across sequences within a single forward pass, enabling efficient parallel evaluation and direct sequence-to-sequence comparison. It combines a unified single-pass architecture, cross-sequence interaction mechanisms, and a design with sublinear computational complexity, and it is backed by theoretical proofs. Experiments on recommendation and NLP tasks show clear gains over conventional evaluators. The method has also been deployed in Kuaishou's production recommendation system, where it delivers substantial and sustained revenue improvements.

📝 Abstract

The Generator-Evaluator (G-E) framework, in which an evaluator scores K sequences produced by a generator and the top-ranked sequence is selected, is a foundational paradigm in tasks such as Recommender Systems (RecSys) and Natural Language Processing (NLP). Traditional evaluators process sequences independently and suffer from two major limitations: (1) a lack of explicit cross-sequence comparison, leading to suboptimal accuracy; and (2) poor parallelization with linear complexity O(K), resulting in inefficient resource utilization that hurts both throughput and latency. To address these challenges, we propose FlashEvaluator, which enables cross-sequence token information sharing and processes all sequences in a single forward pass. This yields sublinear computational complexity, which improves system efficiency, and supports direct inter-sequence comparisons, which improve selection accuracy. The paper also provides theoretical proofs and extensive experiments on recommendation and NLP tasks, demonstrating clear advantages over conventional methods. Notably, FlashEvaluator has been deployed in the online recommender system of Kuaishou, delivering substantial and sustained revenue gains in practice.
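
To make the contrast in the abstract concrete, the following is a minimal toy sketch of the two evaluation styles. It is not the FlashEvaluator architecture, which this page does not specify. All names (`pooled`, `evaluate_independently`, `evaluate_jointly`, `select_top`) are hypothetical, each token is stood in for by a single float, and the cross-sequence step is an assumed softmax-weighted summary chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence

# Assumption: each token is a single float, a toy stand-in for an embedding.
Tokens = Sequence[float]


def pooled(seq: Tokens) -> float:
    """Mean-pool a sequence's token values into one summary number."""
    return sum(seq) / len(seq)


def evaluate_independently(candidates: List[Tokens],
                           score_fn: Callable[[Tokens], float]) -> List[float]:
    """Baseline G-E evaluator: K separate calls, no candidate sees another."""
    return [score_fn(seq) for seq in candidates]


def evaluate_jointly(candidates: List[Tokens]) -> List[float]:
    """Toy joint evaluator: one call over all K candidates.

    Each score is the candidate's pooled value minus a softmax-weighted
    summary of all candidates, so every score is explicitly comparative.
    This is an illustrative assumption, not the paper's mechanism.
    """
    summaries = [pooled(seq) for seq in candidates]
    peak = max(summaries)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in summaries]
    total = sum(weights)
    context = sum(w * s for w, s in zip(weights, summaries)) / total
    return [s - context for s in summaries]


def select_top(scores: List[float]) -> int:
    """Return the index of the highest-scoring candidate."""
    return max(range(len(scores)), key=scores.__getitem__)


# K = 3 hypothetical candidate sequences from a generator.
candidates = [[0.1, 0.4, 0.3], [0.9, 0.7, 0.8], [0.5, 0.5, 0.2]]
independent_scores = evaluate_independently(candidates, pooled)
joint_scores = evaluate_jointly(candidates)
best_independent = select_top(independent_scores)
best_joint = select_top(joint_scores)
```

In this toy, subtracting a shared context shifts every score by the same amount, so both evaluators pick the same candidate. The sketch only shows where cross-sequence information enters the computation; a learned evaluator with token-level sharing could change the ranking itself.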

Problem

Research questions and friction points this paper is trying to address.

Generator-Evaluator framework

cross-sequence comparison

parallelization

computational complexity

sequence evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

FlashEvaluator

parallel evaluation

cross-sequence comparison