MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning

📅 2024-09-18
🏛️ arXiv.org
📈 Citations: 22
Influential: 2
🤖 AI Summary
To address three key challenges in LLM-based reasoning (over-refinement, difficulty in error localization, and uncertainty about the optimal number of refinement iterations), this paper proposes a difficulty-aware framework that integrates coarse-grained aggregation with fine-grained, multi-agent collaborative refinement. The two-level adaptive paradigm uses coarse-grained aggregation for efficiency on easy problems and fine-grained multi-agent refinement for precision on hard ones. A step-level reward model enables granular error localization, while a closed-loop tri-agent mechanism (Solver-Reviewer-Refiner) supports dynamic iteration termination. The framework builds on Chain-of-Thought (CoT) prompting and Self-Consistency, augmented by a difficulty classifier and an iterative re-evaluation module. Evaluated on five mathematical reasoning benchmarks, it outperforms Self-Consistency (+3.4%), Best-of-k (+3.2%), and Self-Refine (+4.0%) with a single refinement iteration while using less than half the samples, and unlike the baselines it continues to improve over further iterations.

📝 Abstract
Large Language Models' (LLM) reasoning can be improved using test-time aggregation strategies, i.e., generating multiple samples and voting among generated samples. While these improve performance, they often reach a saturation point. Refinement offers an alternative by using LLM-generated feedback to improve solution quality. However, refinement introduces 3 key challenges: (1) Excessive refinement: Uniformly refining all instances can over-correct and reduce the overall performance. (2) Inability to localize and address errors: LLMs have a limited ability to self-correct and struggle to identify and correct their own mistakes. (3) Insufficient refinement: Deciding how many iterations of refinement are needed is non-trivial, and stopping too soon could leave errors unaddressed. To tackle these issues, we propose MAgICoRe, which avoids excessive refinement by categorizing problem difficulty as easy or hard, solving easy problems with coarse-grained aggregation and hard ones with fine-grained and iterative multi-agent refinement. To improve error localization, we incorporate external step-wise reward model (RM) scores. Moreover, to ensure effective refinement, we employ a multi-agent loop with three agents: Solver, Reviewer (which generates targeted feedback based on step-wise RM scores), and the Refiner (which incorporates feedback). To ensure sufficient refinement, we re-evaluate updated solutions, iteratively initiating further rounds of refinement. We evaluate MAgICoRe on Llama-3-8B and GPT-3.5 and show its effectiveness across 5 math datasets. Even one iteration of MAgICoRe beats Self-Consistency by 3.4%, Best-of-k by 3.2%, and Self-Refine by 4.0% while using less than half the samples. Unlike iterative refinement with baselines, MAgICoRe continues to improve with more iterations. Finally, our ablations highlight the importance of MAgICoRe's RMs and multi-agent communication.
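The pipeline described in the abstract (sample-and-score, difficulty gating, then a Solver-Reviewer-Refiner loop with re-evaluation) can be sketched as follows. This is a minimal, illustrative sketch, not the paper's implementation: all function names (`solver_llm`, `reviewer_llm`, `refiner_llm`, `step_rm`) and thresholds are assumptions, and the LLM and reward-model calls are stubbed out.

```python
def solver_llm(prompt):
    """Stub for the Solver LLM; a real system calls a chat model here."""
    return "Step 1: compute 6*7. Step 2: the answer is 42."

def reviewer_llm(solution, step_scores):
    """Stub Reviewer: targets feedback at the lowest-scoring step."""
    worst = step_scores.index(min(step_scores))
    return f"Step {worst + 1} looks wrong; re-check it."

def refiner_llm(solution, feedback):
    """Stub Refiner: a real model would rewrite the flagged step."""
    return solution

def step_rm(solution):
    """Stub step-wise reward model: one score per reasoning step."""
    return [0.9, 0.8]

def magicore(problem, k=4, max_iters=3, easy_threshold=0.7):
    # Coarse stage: sample k candidates, score each by its worst step.
    candidates = [solver_llm(problem) for _ in range(k)]
    scored = [(min(step_rm(c)), c) for c in candidates]
    best_score, best = max(scored)

    # Easy problems stop at coarse-grained aggregation (no refinement).
    if best_score >= easy_threshold:
        return best

    # Hard problems enter the fine-grained Reviewer -> Refiner loop,
    # re-evaluating after each round to decide whether to continue.
    solution = best
    for _ in range(max_iters):
        scores = step_rm(solution)
        feedback = reviewer_llm(solution, scores)
        solution = refiner_llm(solution, feedback)
        if min(step_rm(solution)) >= easy_threshold:
            break
    return solution
```

With the stubbed scores the candidate clears the easy threshold, so the call returns after the coarse stage; lowering the stub scores would route the problem into the refinement loop instead.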
Problem

Research questions and friction points this paper is trying to address.

Addresses excessive refinement by categorizing problem difficulty
Improves error localization using step-wise reward model scores
Ensures sufficient refinement via iterative multi-agent feedback loop
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent loop with solver, reviewer, refiner roles
Coarse-to-fine refinement based on problem difficulty
Step-wise reward model scores for error localization
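The step-wise error-localization idea in the last bullet can be sketched as below; `localize_error`, `targeted_feedback`, and the 0.5 threshold are illustrative assumptions, not the paper's code.

```python
def localize_error(steps, scores, threshold=0.5):
    """Return indices of steps whose RM score falls below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]

def targeted_feedback(steps, scores):
    """Build Reviewer-style feedback pointing at the low-scoring steps."""
    bad = localize_error(steps, scores)
    if not bad:
        return "All steps look sound."
    lines = [f"Step {i + 1} ('{steps[i]}') scored {scores[i]:.2f}; revise it."
             for i in bad]
    return "\n".join(lines)
```

Because the reward model scores each step rather than the whole solution, the feedback names the faulty step directly instead of asking the model to find its own mistake.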