🤖 AI Summary
This paper addresses the challenge of dynamic error detection during large language model (LLM) reasoning. The authors propose Dyve, a dynamic process verification framework grounded in dual-process cognitive theory: it integrates intuitive, rapid System 1 reasoning with deliberate, analytical System 2 reasoning to adapt verification intensity at each step. Dyve introduces a novel stepwise consensus-filtering supervision mechanism that combines Monte Carlo sampling with multi-perspective LLM evaluation to distill high-quality supervision signals from noisy feedback. Evaluated on the ProcessBench and MATH benchmarks, Dyve significantly outperforms existing methods: under Best-of-N settings it achieves substantial improvements in final-answer accuracy, demonstrating both effectiveness and generalizability across diverse reasoning tasks.
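The consensus-filtering idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `monte_carlo_correctness` and `llm_judge_votes` are hypothetical stand-ins (here seeded random draws) for real Monte Carlo rollouts and LLM evaluations, and a step's label is kept only when the two sources agree.

```python
import random

def monte_carlo_correctness(step_id, n_rollouts=8):
    # Hypothetical stand-in for completing the solution from this step
    # many times with a generator LLM; here a seeded coin flip per rollout.
    rng = random.Random(step_id)
    return sum(rng.random() < 0.7 for _ in range(n_rollouts)) / n_rollouts

def llm_judge_votes(step_id, n_judges=3):
    # Hypothetical stand-in for independent LLM evaluations of the step.
    rng = random.Random(step_id + 1000)
    return [rng.random() < 0.7 for _ in range(n_judges)]

def consensus_filter(steps, mc_threshold=0.5):
    """Keep a step's label only when both supervision sources agree on it."""
    kept = []
    for i, step in enumerate(steps):
        mc_correct = monte_carlo_correctness(i) >= mc_threshold
        votes = llm_judge_votes(i)
        judge_correct = sum(votes) > len(votes) // 2
        if mc_correct == judge_correct:  # consensus reached
            kept.append((step, mc_correct))
    return kept
```

Filtering on agreement between two noisy label sources is what distills the "high-quality supervision signals" the summary refers to; disagreeing steps are simply discarded rather than guessed at.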
📝 Abstract
We present Dyve, a dynamic process verifier that enhances reasoning error detection in large language models by integrating fast and slow thinking, inspired by Kahneman's Systems Theory. Dyve adaptively applies immediate token-level confirmation (System 1) for straightforward steps and comprehensive analysis (System 2) for complex ones. Leveraging a novel step-wise consensus-filtered process supervision technique that combines Monte Carlo estimation with LLM-based evaluation, Dyve curates high-quality supervision signals from noisy data. Experimental results on ProcessBench and the MATH dataset confirm that Dyve significantly outperforms existing process-based verifiers and boosts performance in Best-of-N settings.
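The adaptive fast/slow verification and its use for Best-of-N selection can be sketched as below. This is an illustrative toy, not Dyve's actual verifier: `fast_check` and `slow_check` are hypothetical placeholders (simple string heuristics) for a cheap token-level confirmation and a full deliberate verification pass, with escalation only when the fast path is uncertain.

```python
def fast_check(step):
    # System 1: cheap, immediate confirmation. Returns None when
    # uncertain, which triggers escalation to System 2.
    if "?" in step:
        return None
    return "error" not in step

def slow_check(step):
    # System 2: stand-in for a comprehensive, step-by-step analysis.
    return "error" not in step

def verify_solution(steps):
    """Score a candidate as the fraction of steps that pass verification."""
    passed = 0
    for step in steps:
        verdict = fast_check(step)
        if verdict is None:          # escalate only the hard cases
            verdict = slow_check(step)
        passed += verdict
    return passed / len(steps)

def best_of_n(candidates):
    # Best-of-N: return the candidate solution the verifier scores highest.
    return max(candidates, key=verify_solution)
```

The design point is that most steps exit through the cheap path, so verification cost scales with step difficulty rather than with the number of steps; only the uncertain minority pays for the expensive System 2 analysis.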