Dyve: Thinking Fast and Slow for Dynamic Process Verification

📅 2025-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses dynamic error detection during large language model (LLM) reasoning. It proposes Dyve, a dynamic process verification framework that draws on dual-process cognitive theory, pairing fast, intuitive System 1 checks with deliberate, analytical System 2 checks so that verification effort adapts to each reasoning step. Dyve is trained with step-wise consensus-filtered supervision, which combines Monte Carlo sampling with LLM-based evaluation to distill high-quality supervision signals from noisy feedback. On the ProcessBench and MATH benchmarks, Dyve outperforms existing process-based verifiers and improves final-answer accuracy in Best-of-N settings.
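As a rough illustration of the Best-of-N setting mentioned above, the sketch below picks one of N candidate solutions using per-step verifier scores. The function names, the `score_step` callback, and the choice to rank candidates by their weakest step are all assumptions made for illustration; the summary does not state how Dyve aggregates step scores.

```python
from typing import Callable, List, Sequence


def best_of_n(
    candidates: Sequence[List[str]],
    score_step: Callable[[List[str], int], float],
) -> int:
    """Return the index of the candidate whose weakest step scores highest.

    `score_step(steps, i)` is a hypothetical verifier call that returns a
    correctness score in [0, 1] for step i of a solution.
    """

    def solution_score(steps: List[str]) -> float:
        # Assumption: a solution is only as good as its worst step.
        return min(
            (score_step(steps, i) for i in range(len(steps))),
            default=0.0,
        )

    return max(
        range(len(candidates)),
        key=lambda k: solution_score(candidates[k]),
    )
```

Other aggregations (product or last-step score) would slot into `solution_score` without changing the rest of the selection logic.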

📝 Abstract
We present Dyve, a dynamic process verifier that enhances reasoning error detection in large language models by integrating fast and slow thinking, inspired by Kahneman's Systems Theory. Dyve adaptively applies immediate token-level confirmation (System 1) for straightforward steps and comprehensive analysis (System 2) for complex ones. Leveraging a novel step-wise consensus-filtered process supervision technique that combines Monte Carlo estimation with LLM-based evaluation, Dyve curates high-quality supervision signals from noisy data. Experimental results on ProcessBench and the MATH dataset confirm that Dyve significantly outperforms existing process-based verifiers and boosts performance in Best-of-N settings.
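The adaptive fast/slow behavior described in the abstract can be pictured as a simple routing loop. This is a minimal sketch under stated assumptions, not the paper's implementation: `fast_check`, `slow_check`, the confidence `threshold`, and the convention of returning -1 when no error is found are all hypothetical names and choices introduced here.

```python
from typing import Callable, List, Tuple


def verify_steps(
    steps: List[str],
    fast_check: Callable[[List[str]], Tuple[bool, float]],
    slow_check: Callable[[List[str]], bool],
    threshold: float = 0.9,
) -> int:
    """Return the index of the first step judged wrong, or -1 if none is.

    fast_check(prefix) -> (verdict, confidence): a cheap System 1 style check.
    slow_check(prefix) -> verdict: a costly System 2 style analysis.
    Both callbacks judge the last step of the prefix they receive.
    """
    for i in range(len(steps)):
        prefix = steps[: i + 1]
        ok, confidence = fast_check(prefix)
        if confidence < threshold:
            # The fast check is unsure, so escalate to the slow check.
            ok = slow_check(prefix)
        if not ok:
            return i
    return -1
```

Under this design the slow check runs only on steps where the fast check is not confident, which is how the verification cost can scale with step difficulty.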
Problem

Research questions and friction points this paper is trying to address.

Detecting reasoning errors in LLM outputs at the step level
Deciding when a fast check suffices and when slower analysis is needed
Improving the accuracy of process-based verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates fast and slow thinking
Uses step-wise consensus-filtered supervision
Combines Monte Carlo estimation with LLM evaluation
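The consensus-filtering idea listed above can be sketched as follows: a step prefix is kept as a training example only when a Monte Carlo estimate and an LLM judge agree on its label. Everything here is an illustrative assumption, including the `rollout`, `is_correct`, and `judge` callbacks, the number of rollouts, and the 0.5 threshold; the paper's actual procedure may differ.

```python
from typing import Callable, List, Tuple


def mc_estimate(
    prefix: List[str],
    rollout: Callable[[List[str]], str],
    is_correct: Callable[[str], bool],
    n: int = 8,
) -> float:
    """Fraction of n sampled completions of `prefix` that reach a correct answer."""
    hits = sum(1 for _ in range(n) if is_correct(rollout(prefix)))
    return hits / n


def consensus_filter(
    prefixes: List[List[str]],
    rollout: Callable[[List[str]], str],
    is_correct: Callable[[str], bool],
    judge: Callable[[List[str]], bool],
    n: int = 8,
    threshold: float = 0.5,
) -> List[Tuple[List[str], bool]]:
    """Keep (prefix, label) pairs where the Monte Carlo label and the judge agree."""
    kept = []
    for prefix in prefixes:
        mc_label = mc_estimate(prefix, rollout, is_correct, n) >= threshold
        judge_label = judge(prefix)  # hypothetical LLM-based verdict
        if mc_label == judge_label:
            kept.append((prefix, mc_label))
        # Disagreements are treated as noisy and dropped.
    return kept
```

Dropping disagreements trades dataset size for label quality, which matches the stated goal of curating clean supervision from noisy data.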