ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of unsupervised selection of a high-quality output from multiple stochastic generations in open-ended tasks. The authors propose ModeX, a framework that generalizes majority voting to the semantic level: it constructs a similarity graph over candidate texts and recursively applies spectral clustering to identify the modal output that represents the semantic consensus, without requiring external evaluators or exact string matches. A lightweight variant, ModeX-Lite, further improves efficiency through an early-pruning strategy. Evaluated across diverse tasks including text summarization, code generation, and mathematical reasoning, ModeX consistently outperforms both single-path and multi-path baselines while remaining computationally efficient. Notably, the approach enables semantic-level mode extraction without additional models or inference overhead.

๐Ÿ“ Abstract
Selecting a single high-quality output from multiple stochastic generations remains a fundamental challenge for large language models (LLMs), particularly in open-ended tasks where no canonical answer exists. While Best-of-N and self-consistency methods show that aggregating multiple generations can improve performance, existing approaches typically rely on external evaluators, reward models, or exact string-match voting, limiting their applicability and efficiency. We propose Mode Extraction (ModeX), an evaluator-free Best-of-N selection framework that generalizes majority voting to open-ended text generation by identifying the modal output representing the dominant semantic consensus among generated texts. ModeX constructs a similarity graph over candidate generations and recursively applies spectral clustering to select a representative centroid, without requiring additional inference or auxiliary models. We further instantiate this selection principle as ModeX-Lite, an improved version of ModeX with early pruning for efficiency. Across open-ended tasks -- including text summarization, code generation, and mathematical reasoning -- our approaches consistently outperform standard single- and multi-path baselines, providing a computationally efficient solution for robust open-ended text generation. Code is released at https://github.com/deeplearning-wisc/ModeX.
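The core selection principle described in the abstract can be sketched compactly. The following is a minimal illustration, not the authors' implementation: it assumes a precomputed symmetric similarity matrix over the N candidate texts (e.g. cosine similarities of sentence embeddings), performs a single spectral bisection via the Fiedler vector rather than the paper's full recursive clustering, and the function name `modal_output` is hypothetical.

```python
import numpy as np

def modal_output(sims):
    """Select the candidate at the centre of the dominant semantic cluster.

    sims: (n, n) symmetric similarity matrix over candidate generations.
    Returns the index of the selected "modal" candidate.
    """
    n = sims.shape[0]
    if n <= 2:
        # Too few candidates to split; take the one most similar to the rest.
        return int(np.argmax(sims.sum(axis=1)))
    # Unnormalised graph Laplacian L = D - W of the similarity graph.
    degrees = sims.sum(axis=1)
    lap = np.diag(degrees) - sims
    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue is the Fiedler vector.
    _, vecs = np.linalg.eigh(lap)
    fiedler = vecs[:, 1]
    # Spectral bisection: split candidates by the Fiedler vector's sign,
    # then keep the larger (majority-consensus) side.
    pos = fiedler >= 0
    idx = np.where(pos)[0] if pos.sum() >= (~pos).sum() else np.where(~pos)[0]
    # Centroid of the dominant cluster: the member most similar to the others.
    sub = sims[np.ix_(idx, idx)]
    return int(idx[np.argmax(sub.sum(axis=1))])
```

With a toy matrix where candidates 0-2 agree semantically and candidate 3 is an outlier, the bisection isolates the outlier and the centroid of the majority cluster is returned. The full method would recurse on the dominant cluster until it is sufficiently tight, and ModeX-Lite would additionally prune weak candidates early.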
Problem

Research questions and friction points this paper is trying to address.

open-ended generation
Best-of-N selection
output selection
large language models
semantic consensus
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mode Extraction
evaluator-free selection
spectral clustering
open-ended generation
Best-of-N