🤖 AI Summary
This work addresses low answer confidence and inconsistent reasoning in zero-shot cross-modal (text/image/video) question answering. We propose a training-free, sub-question-guided reasoning framework that automatically decomposes an input question into multiple sub-question–answer (sub-QA) paths. Leveraging the large language model's intrinsic confidence scores over its own outputs, our method dynamically refines these sub-paths and fuses them with confidence weights to improve multi-path reasoning consistency. Our key contribution is the first confidence-driven, sub-question quality-aware refinement mechanism, together with an empirical analysis revealing the nonlinear relationship between sub-question quantity/quality and reasoning robustness. Experiments demonstrate consistent accuracy improvements across multimodal QA benchmarks, broad compatibility with both open- and closed-source QA models, and substantial gains in zero-shot generalization and decision reliability.
📝 Abstract
We propose Confidence-guided Refinement Reasoning (C2R), a novel training-free framework applicable to question-answering (QA) tasks across text, image, and video domains. C2R strategically constructs and refines sub-questions and their answers (sub-QAs), deriving a better confidence score for the target answer. C2R first curates a subset of sub-QAs to explore diverse reasoning paths, then compares the confidence scores of the resulting answer candidates to select the most reliable final answer. Since C2R relies solely on confidence scores derived from the model itself, it can be seamlessly integrated with various existing QA models, demonstrating consistent performance improvements across diverse models and benchmarks. Furthermore, we provide essential yet underexplored insights into how leveraging sub-QAs affects model behavior, specifically analyzing the impact of both the quantity and quality of sub-QAs on achieving robust and reliable reasoning.
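The selection step described above — answering along several sub-QA paths and comparing the model's own confidence scores to pick the final answer — can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' implementation: `answer_with_confidence` is a hypothetical stand-in for any QA model that returns an answer together with a self-reported confidence (e.g. an answer log-probability), and the fusion rule shown (summing confidences per candidate) is one simple instantiation of confidence-weighted selection.

```python
from collections import defaultdict

def answer_with_confidence(question, context):
    # Hypothetical stand-in for a QA model that returns an
    # (answer, confidence) pair given the question and a context
    # string; any model exposing a self-confidence score fits here.
    raise NotImplementedError

def confidence_guided_select(question, sub_qa_paths, qa_model=answer_with_confidence):
    """Sketch of confidence-guided answer selection over sub-QA paths.

    Each path is a list of (sub_question, sub_answer) pairs prepended
    to the context before the target question is answered. Answer
    candidates are fused by summing the confidences that support them,
    and the highest-scoring candidate is returned.
    """
    support = defaultdict(float)
    for path in sub_qa_paths:
        context = " ".join(f"Q: {q} A: {a}" for q, a in path)
        answer, conf = qa_model(question, context)
        support[answer] += conf  # confidence-weighted fusion
    return max(support, key=support.get)
```

Because the routine only consumes (answer, confidence) pairs, any open- or closed-source QA model that exposes such scores can be plugged in without retraining, which mirrors the training-free, model-agnostic property claimed in the abstract.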