Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks

📅 2025-09-26
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the high computational cost and self-assessment bias inherent in large language models (LLMs) performing self-improvement on unverifiable open-ended tasks (e.g., machine translation), this paper proposes a lightweight semantic voting method that eliminates explicit self-evaluation. The core innovation extends conventional hard-matching majority voting to soft voting based on semantic similarity, leveraging efficient sentence embedding models to generate high-quality pseudo-labels, thereby avoiding the computational overhead and overconfidence issues associated with self-scoring and entropy minimization. Experiments across diverse model architectures and translation benchmarks demonstrate significant improvements: an average inference speedup of 2.3×, and gains of +1.8 BLEU and +2.4 COMET. Moreover, the method generalizes well across tasks and models. This work establishes a more reliable and scalable unsupervised paradigm for LLM self-optimization.
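The soft-voting mechanism described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the `semantic_vote` helper and the hand-made 3-dimensional "embeddings" are stand-ins for a real lightweight sentence embedding model; the selection rule shown (pick the candidate with the highest total cosine similarity to all other candidates) is one plausible way to realize a semantic majority vote.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantic_vote(candidates, embeddings):
    """Soft majority vote: return the candidate whose embedding has
    the highest total similarity to all other candidates' embeddings,
    i.e. the semantic 'consensus' response used as the pseudo-label."""
    best_idx, best_score = 0, float("-inf")
    for i, e_i in enumerate(embeddings):
        score = sum(cosine(e_i, e_j)
                    for j, e_j in enumerate(embeddings) if j != i)
        if score > best_score:
            best_idx, best_score = i, score
    return candidates[best_idx]

# Toy example: two near-paraphrases outvote one unrelated outlier,
# so a member of the consensus cluster is chosen as the pseudo-label.
cands = ["the cat sat", "a cat sat down", "stock prices fell"]
embs = [[0.9, 0.1, 0.0], [0.85, 0.2, 0.05], [0.0, 0.1, 0.95]]
print(semantic_vote(cands, embs))  # "a cat sat down"
```

In a real pipeline the toy vectors would be replaced by sentence embeddings computed once per candidate, which is what keeps the method cheap relative to asking the LLM itself to judge every candidate.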

๐Ÿ“ Abstract
The rising cost of acquiring supervised data has driven significant interest in self-improvement for large language models (LLMs). Straightforward unsupervised signals like majority voting have proven effective in generating pseudo-labels for verifiable tasks, while their applicability to unverifiable tasks (e.g., translation) is limited by the open-ended character of responses. As a result, self-evaluation mechanisms (e.g., self-judging and entropy minimization) are predominantly used to derive pseudo-labels. However, self-evaluation relying on LLMs typically incurs high computational overhead and introduces overconfidence issues due to intrinsic biases. To address these challenges, we propose a novel self-evaluation-free approach for unverifiable tasks, designed for lightweight yet effective self-improvement. Inspired by majority voting commonly employed in verifiable tasks, we propose semantic voting as a novel mechanism that relaxes the principle of hard matching (i.e., exact matching) toward soft matching (i.e., semantic similarity). Soft matching is achieved by leveraging a lightweight sentence embedding model to quantify semantic similarity, thereby mitigating excessive computational burden and intrinsic bias-associated limitations of self-evaluation. Comprehensive experiments demonstrate that our method achieves substantial gains in computational efficiency and overall better performance than self-evaluation methods across diverse model architectures and tasks.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational costs in LLM self-improvement for unverifiable tasks
Eliminating overconfidence bias from self-evaluation mechanisms in LLMs
Enabling efficient pseudo-labeling for open-ended tasks using semantic similarity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic voting replaces hard matching with soft matching
Uses lightweight sentence embedding for semantic similarity
Eliminates self-evaluation to reduce computational overhead
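The contrast between hard and soft matching that the innovation list names can be made concrete with a toy hard-matching vote. This sketch is illustrative only (the example strings and the `hard_majority_vote` helper are not from the paper): exact-match counting works when answers are verifiable and identical, but on open-ended tasks every valid paraphrase is a unique string, so hard voting degenerates and some similarity-based relaxation is needed.

```python
from collections import Counter

def hard_majority_vote(candidates):
    # Exact-match (hard) majority voting: count identical strings
    # and return the most frequent one with its count.
    winner, n = Counter(candidates).most_common(1)[0]
    return winner, n

# Verifiable task: identical answers can agree exactly.
print(hard_majority_vote(["42", "41", "42"]))  # ('42', 2)

# Open-ended task: three valid paraphrases never match exactly,
# so every candidate gets a count of 1 and the "vote" is meaningless.
paraphrases = ["The cat sat.", "A cat was sitting.", "The cat took a seat."]
print(hard_majority_vote(paraphrases))
```

Relaxing the equality test to embedding similarity, as the paper's semantic voting does, recovers a meaningful consensus signal in exactly this degenerate case.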