Logic-Regularized Verifier Elicits Reasoning from LLMs

2026-05-07
Citations: 0
Influential: 0

AI Summary
Traditional verifiers rely on supervised data that is expensive to build and limited in diversity, which restricts how efficiently they can improve the reasoning of large language models. This work proposes LOVER, an unsupervised verifier that operates without any supervised data. LOVER models the verifier as a binary latent variable, reads the model's internal activations, and imposes three logical constraints across multiple reasoning paths: negation consistency, intra-group consistency, and inter-group consistency, where paths are grouped by their final answer. According to the summary, LOVER is the first unsupervised verifier to incorporate logical rules as priors, and it can be applied directly to any off-the-shelf large language model. Evaluated on ten benchmark datasets, LOVER substantially outperforms existing unsupervised methods and reaches, on average, 95% of the performance of a supervised verifier.
๐Ÿ“ Abstract
Verifiers are crucial components for enhancing modern LLMs' reasoning capability. Typicalverifiers require resource-intensive superviseddataset construction, which is costly and faceslimitations in data diversity. In this paper, wepropose LOVER, an unsupervised verifier regularized by logical rules. LOVER treats theverifier as a binary latent variable, utilizinginternal activations and enforcing three logical constraints on multiple reasoning paths:negation consistency, intra-group consistency,and inter-group consistency (grouped by thefinal answer). By incorporating logical rulesas priors, LOVER can leverage unlabeled examples and is directly compatible with any offthe-shelf LLMs. Experiments on 10 datasetsdemonstrate that LOVER significantly outperforms unsupervised baselines, achieving performance comparable to the supervised verifier(reaching its 95% level on average). The sourcecode is publicly available at https://github.com/wangxinyufighting/llm-lover.
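The abstract names the three constraints but does not give their formulas, so the following is only an illustrative sketch of how such penalties could be written. It is not taken from the LOVER source code. Every function name and every penalty form here is an assumption: negation consistency is written as "the score of a statement and of its negation should sum to 1", intra-group consistency as "paths with the same final answer should get similar scores", and inter-group consistency as "at most one final answer can be correct, so the group means should sum to at most 1". The scores stand in for verifier probabilities that would, in the paper, be computed from internal activations.

```python
from collections import defaultdict
from statistics import mean


def group_by_answer(scores, answers):
    """Collect verifier scores by the final answer of each reasoning path."""
    groups = defaultdict(list)
    for score, answer in zip(scores, answers):
        groups[answer].append(score)
    return groups


def negation_consistency(p_pos, p_neg):
    """Assumed form: P(path is correct) + P(path is not correct) should be 1."""
    return mean([(p + q - 1.0) ** 2 for p, q in zip(p_pos, p_neg)])


def intra_group_consistency(scores, answers):
    """Assumed form: mean within-group variance of the scores."""
    penalties = []
    for members in group_by_answer(scores, answers).values():
        mu = mean(members)
        penalties.append(mean([(s - mu) ** 2 for s in members]))
    return mean(penalties)


def inter_group_consistency(scores, answers):
    """Assumed form: group mean scores should sum to at most 1."""
    groups = group_by_answer(scores, answers)
    total = sum(mean(members) for members in groups.values())
    return max(0.0, total - 1.0) ** 2


def logic_regularizer(p_pos, p_neg, answers, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three hypothetical penalties (weights are assumptions)."""
    w_neg, w_intra, w_inter = weights
    return (
        w_neg * negation_consistency(p_pos, p_neg)
        + w_intra * intra_group_consistency(p_pos, answers)
        + w_inter * inter_group_consistency(p_pos, answers)
    )


# Toy example: three sampled reasoning paths, two distinct final answers.
p_pos = [0.9, 0.8, 0.2]       # score for "this path is correct"
p_neg = [0.1, 0.2, 0.8]       # score for the negated statement
answers = ["42", "42", "17"]  # final answer reached by each path
loss = logic_regularizer(p_pos, p_neg, answers)
```

In this toy input the negation penalty is essentially zero because each pair of scores sums to 1, while the other two terms are small but nonzero. In an actual unsupervised setup, a loss of this kind would be minimized over the parameters of a probe on the model's activations, which is outside the scope of this sketch.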
Problem

Research questions and friction points this paper is trying to address.

verifier
reasoning
large language models
supervised dataset
data diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

unsupervised verifier
logical constraints
reasoning paths
consistency regularization
large language models