🤖 AI Summary
Current zero-shot detection methods for large language model (LLM)-generated text rely on fixed proxy models, which struggle to generalize across unknown generation sources due to distributional shifts, leading to unstable performance. This work proposes DetectRouter, a novel framework that reframes detection as a dynamic routing problem. Through a prototype-based two-stage routing mechanism, it models proxy-input alignment as a geometric correspondence task, enabling robust generalization across both white-box and black-box generation sources. The approach integrates prototype learning, geometric distance alignment, and detection scores to construct a text-detector affinity model. Extensive evaluations on the EvoBench and MAGE benchmarks demonstrate consistent and significant performance gains across diverse model families and detection metrics.
📝 Abstract
Zero-shot methods detect LLM-generated text by computing statistical signatures using a surrogate model. Existing approaches typically employ a fixed surrogate for all inputs regardless of the unknown source. We systematically examine this design and find that detection performance varies substantially depending on surrogate-source alignment. We observe that while no single surrogate achieves optimal performance universally, a well-matched surrogate typically exists within a diverse pool for any given input. This finding transforms robust detection into a routing problem: selecting the most appropriate surrogate for each input. We propose DetectRouter, a prototype-based framework that learns text-detector affinity through two-stage training. The first stage constructs discriminative prototypes from white-box models; the second generalizes to black-box sources by aligning geometric distances with observed detection scores. Experiments on EvoBench and MAGE benchmarks demonstrate consistent improvements across multiple detection criteria and model families.
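The routing idea described in the abstract can be sketched as follows. This is a minimal illustration only, assuming each surrogate in the pool is represented by a learned prototype vector and an input is routed to the surrogate whose prototype lies geometrically closest; the function names, toy prototypes, and stand-in scoring functions below are hypothetical, not the paper's implementation:

```python
import numpy as np

def route_and_score(embedding, prototypes, detectors):
    """Route an input embedding to the surrogate whose prototype is nearest
    in Euclidean distance, then return that surrogate's detection score."""
    dists = np.linalg.norm(prototypes - embedding, axis=1)  # geometric correspondence
    best = int(np.argmin(dists))                            # select best-matched surrogate
    return best, detectors[best](embedding)

# Toy pool of two "surrogates": each prototype pairs with a stand-in scorer.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
detectors = [lambda e: float(e.sum()),   # placeholder score for surrogate 0
             lambda e: float(e.mean())]  # placeholder score for surrogate 1

idx, score = route_and_score(np.array([0.9, 1.1]), prototypes, detectors)
# idx == 1: the input embedding lies nearer the second prototype
```

In the full framework, the prototypes would be learned from white-box models in stage one and the distance-score alignment refined for black-box sources in stage two; the sketch above only captures the per-input selection step.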