AI Summary
Traditional verifiers rely on supervised data that is expensive to build and limited in diversity, which restricts how efficiently they can improve the reasoning capabilities of large language models. This work proposes LOVER, the first unsupervised verifier that operates without any supervised data. It models the verifier as a binary latent variable over internal model activations and imposes three logical constraints on multiple reasoning paths (negation consistency, intra-group consistency, and inter-group consistency, with paths grouped by final answer) to identify reliable reasoning paths. Notably, LOVER is the first to incorporate logical rules as priors in the design of an unsupervised verifier, and it can be applied directly to any off-the-shelf large language model. Evaluated on ten benchmark datasets, LOVER substantially outperforms existing unsupervised methods and reaches, on average, 95% of the performance of a supervised verifier.
Abstract
Verifiers are crucial components for enhancing modern LLMs' reasoning capability. Typical verifiers require resource-intensive supervised dataset construction, which is costly and faces limitations in data diversity. In this paper, we propose LOVER, an unsupervised verifier regularized by logical rules. LOVER treats the verifier as a binary latent variable, utilizing internal activations and enforcing three logical constraints on multiple reasoning paths: negation consistency, intra-group consistency, and inter-group consistency (grouped by the final answer). By incorporating logical rules as priors, LOVER can leverage unlabeled examples and is directly compatible with any off-the-shelf LLMs. Experiments on 10 datasets demonstrate that LOVER significantly outperforms unsupervised baselines, achieving performance comparable to the supervised verifier (reaching its 95% level on average). The source code is publicly available at https://github.com/wangxinyufighting/llm-lover.
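To make the three constraints concrete, the following is a minimal sketch of how they might be written as penalty terms over verifier scores. It is not the authors' implementation: the function names, the squared-error penalties, and the reading of each constraint (scores of a path and its negation sum to 1, paths sharing a final answer receive similar scores, at most one answer group is correct) are assumptions made for illustration. Scores are assumed to be probabilities in [0, 1] produced by some probe over internal activations, which is not shown here.

```python
from collections import defaultdict
from statistics import mean


def negation_loss(p_path, p_negated):
    """Assumed negation consistency: p(path) + p(negated path) should equal 1."""
    return mean((p + q - 1.0) ** 2 for p, q in zip(p_path, p_negated))


def group_by_answer(scores, answers):
    """Group verifier scores by the final answer of each reasoning path."""
    groups = defaultdict(list)
    for score, answer in zip(scores, answers):
        groups[answer].append(score)
    return dict(groups)


def intra_group_loss(groups):
    """Assumed intra-group consistency: paths with the same final answer
    should receive similar scores (mean within-group variance)."""
    variances = []
    for group in groups.values():
        center = mean(group)
        variances.append(mean((s - center) ** 2 for s in group))
    return mean(variances)


def inter_group_loss(groups):
    """Assumed inter-group consistency: at most one final answer is correct,
    so the group-level mean scores should not sum to more than 1."""
    excess = sum(mean(group) for group in groups.values()) - 1.0
    return max(0.0, excess) ** 2


# Hypothetical scores for three sampled reasoning paths and their negations.
scores = [0.9, 0.8, 0.3]
negated_scores = [0.1, 0.2, 0.7]
final_answers = ["42", "42", "17"]

groups = group_by_answer(scores, final_answers)
total = (
    negation_loss(scores, negated_scores)
    + intra_group_loss(groups)
    + inter_group_loss(groups)
)
print(f"total consistency penalty: {total:.5f}")
```

In a trainable setting these terms would be written in a differentiable framework and minimized over unlabeled examples. How the terms are weighted, and the exact form each takes in LOVER, should be taken from the paper and the linked repository rather than from this sketch.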