🤖 AI Summary
To address the unreliability of large language models (LLMs) in complex reasoning tasks and the underutilization of intermediate reasoning traces, this paper proposes Semantic Self-Consistency (SSC). SSC is the first method to incorporate semantic similarity modeling into the self-consistency framework: it jointly encodes both reasoning paths (e.g., using Sentence-BERT embeddings) and final answers, then performs weighted voting based on path-level semantic similarity rather than simple answer aggregation. Evaluated on multiple mathematical and symbolic reasoning benchmarks, SSC significantly improves reasoning robustness, increasing reasoning-path consistency by 23% and average answer accuracy by 7.4% over standard self-consistency. Its core contribution lies in explicitly integrating intermediate reasoning semantics into the decision mechanism, thereby enhancing both the interpretability and the trustworthiness of LLM-generated reasoning.
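The mechanism described above, weighting each sampled answer by how semantically close its reasoning path is to the other sampled paths, can be sketched as follows. This is a minimal toy, not the paper's implementation: it substitutes a bag-of-words cosine similarity for Sentence-BERT embeddings, and the mean-similarity weighting rule and function names are illustrative assumptions.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words embedding; a real system would use Sentence-BERT."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    nu = sqrt(sum(c * c for c in u.values()))
    nv = sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_self_consistency(samples):
    """samples: list of (reasoning_path, final_answer) pairs.

    Each sample's vote is weighted by the mean similarity of its
    reasoning path to all other paths, so semantically mutually
    consistent rationales reinforce each other; answers are then
    tallied with these weights instead of a plain majority vote.
    """
    embs = [embed(path) for path, _ in samples]
    n = len(samples)
    weights = []
    for i in range(n):
        sims = [cosine(embs[i], embs[j]) for j in range(n) if j != i]
        weights.append(sum(sims) / len(sims) if sims else 1.0)
    tally = {}
    for (_, answer), w in zip(samples, weights):
        tally[answer] = tally.get(answer, 0.0) + w
    return max(tally, key=tally.get)

samples = [
    ("add 3 and 4 to get 7", "7"),
    ("sum 3 and 4 giving 7", "7"),
    ("multiply 3 by 4 to get 12", "12"),
]
print(semantic_self_consistency(samples))  # → 7
```

Here the two rationales that describe addition are closer to each other than to the multiplication rationale, so their shared answer receives more total weight than it would under unweighted majority voting.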
📝 Abstract
While large language models (LLMs) have rapidly improved their performance on a broad range of tasks, they still often fall short on reasoning tasks. As LLMs become more integrated into diverse real-world applications, advancing their reasoning capabilities is crucial to their effectiveness on nuanced, complex problems. Wang et al.'s self-consistency framework reveals that sampling multiple rationales before taking a majority vote reliably improves model performance across various closed-answer reasoning tasks. Standard methods based on this framework aggregate the final decisions of these rationales but fail to utilize the semantic information detailed in the step-by-step reasoning paths. Our work introduces semantic self-consistency, which enhances this approach by incorporating and analyzing both the reasoning paths of these rationales and their final decisions before taking a majority vote. These methods not only improve the reliability of reasoning paths but also yield more robust performance on complex reasoning tasks.