Does Self-Consistency Improve the Recall of Encyclopedic Knowledge?

📅 2026-04-21
🤖 AI Summary
Self-consistency has demonstrated strong performance in symbolic reasoning, yet its effect on factual knowledge recall remains underexplored. This work evaluates self-consistency on knowledge-intensive tasks by constructing a knowledge recall subset of the MMLU benchmark, partitioned with a data-driven heuristic from prior work. The study combines chain-of-thought (CoT) prompting with self-consistency decoding and evaluates the approach on GPT-4o. The results show that self-consistency consistently improves performance on both symbolic reasoning and knowledge recall, reaching 89% accuracy on MMLU, the best reported result with GPT-4o to date. These findings extend the recognized applicability of self-consistency beyond symbolic reasoning to factual recall.

📝 Abstract
While self-consistency is known to improve performance on symbolic reasoning, its effect on the recall of encyclopedic knowledge is unclear due to a lack of targeted evaluation grounds. To address this, we establish a knowledge recall split for the popular MMLU benchmark by applying a data-driven heuristic from prior work. We validate this split by showing that the performance patterns on the symbolic reasoning and knowledge recall subsets mirror those of GSM8K and MedMCQA, respectively. On this basis, we find that self-consistency consistently improves performance across both symbolic reasoning and knowledge recall, even though its underlying CoT prompting is primarily effective for symbolic reasoning. As a result, we achieve 89% accuracy on MMLU, the best performance to date with GPT-4o.
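
As a rough illustration of the self-consistency decoding the abstract refers to, the sketch below takes several sampled chain-of-thought completions for one multiple-choice question and returns the majority answer. It is a minimal sketch under stated assumptions, not the paper's implementation: the `Answer: X` output format, the function names `extract_choice` and `self_consistency_vote`, and the sample completions are all hypothetical, and the step that samples completions from a model is left out.

```python
import re
from collections import Counter
from typing import List, Optional

# Assumption (hypothetical format): each sampled reasoning chain states its
# final choice as "Answer: B" or "Answer: (B)". The paper's actual prompt and
# output format are not given in this summary.
ANSWER_PATTERN = re.compile(r"Answer:\s*\(?([A-D])\b", re.IGNORECASE)


def extract_choice(completion: str) -> Optional[str]:
    """Return the last multiple-choice letter stated in a completion, if any."""
    matches = ANSWER_PATTERN.findall(completion)
    return matches[-1].upper() if matches else None


def self_consistency_vote(completions: List[str]) -> Optional[str]:
    """Majority vote over the answers extracted from sampled completions."""
    choices = [c for c in map(extract_choice, completions) if c is not None]
    if not choices:
        return None
    # Ties are resolved in favour of the answer that appeared first.
    return Counter(choices).most_common(1)[0][0]


# Hypothetical sampled completions for a single question.
samples = [
    "The treaty was signed in 1648, so option B fits. Answer: B",
    "Westphalia ended the Thirty Years' War. Answer: (B)",
    "I recall a later date. Answer: C",
]
voted = self_consistency_vote(samples)
```

Completions with no parsable answer are simply dropped from the vote here; an evaluation harness might instead count them as wrong, which is a design choice this sketch leaves open.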
Problem

Research questions and friction points this paper is trying to address.

self-consistency
encyclopedic knowledge
knowledge recall
MMLU benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-consistency
knowledge recall
MMLU benchmark
chain-of-thought prompting
encyclopedic knowledge