CCQA: Generating Question from Solution Can Improve Inference-Time Reasoning in SLMs

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Small language models (SLMs) suffer from low reasoning accuracy, and conventional optimization methods yield limited improvements. Method: This paper proposes Cycle-Consistency in Question Answering (CCQA), an inference-time refinement framework built on a backward question-generation mechanism. Given a candidate answer and its reasoning path, CCQA reconstructs a question from them and computes its semantic similarity to the original input question; the answer whose reconstructed question is most similar to the original (i.e., the most cycle-consistent candidate) is selected, without requiring additional training or parameter expansion. Contribution/Results: CCQA is the first systematic application of cycle consistency to SLM reasoning optimization. Implemented via a lightweight Flan-T5-based question generator, it achieves consistent and significant gains across eight distinct SLMs on both mathematical and commonsense reasoning benchmarks, surpassing existing state-of-the-art methods. The approach establishes an efficient, scalable, and training-free paradigm for enhancing reasoning in small language models.

📝 Abstract
Recently, inference-time reasoning strategies have further improved the accuracy of large language models (LLMs), but their effectiveness on smaller models remains unclear. Based on the observation that conventional approaches often fail to improve performance in this context, we propose Cycle-Consistency in Question Answering (CCQA), a novel reasoning method that can be effectively applied to SLMs. Inspired by cycle consistency, CCQA generates a question from each reasoning path and answer, evaluates each by its similarity to the original question, and then selects the candidate solution with the highest similarity score as the final response. Since conventional SLMs struggle to generate accurate questions from their own reasoning paths and answers, we employ a lightweight Flan-T5 model specialized for question generation to support this process efficiently. From the experimental results, it is verified that CCQA consistently outperforms existing state-of-the-art (SOTA) methods across eight models on mathematical and commonsense reasoning benchmarks. Furthermore, our method establishes a new practical baseline for efficient reasoning in SLMs. Source code can be found at https://github.com/scai-research/ccqa_official.
Problem

Research questions and friction points this paper is trying to address.

Improving reasoning accuracy in small language models
Addressing the failure of conventional inference-time approaches on SLMs
Developing an efficient inference-time reasoning method for SLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates questions from reasoning paths and answers
Uses lightweight Flan-T5 model for question generation
Selects solution based on similarity to original question
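The selection mechanism described above can be sketched in a few lines. This is a minimal, illustrative sketch, not the paper's implementation: `generate_question` is a hypothetical stand-in for the paper's Flan-T5 question generator, and `difflib.SequenceMatcher` is a crude, dependency-free proxy for the semantic-similarity scoring the paper uses.

```python
# Hedged sketch of CCQA-style cycle-consistent answer selection.
# Assumptions (not from the paper's code): generate_question() is a toy
# stand-in for the Flan-T5 generator, and string-overlap similarity
# stands in for semantic similarity.
from difflib import SequenceMatcher


def generate_question(reasoning_path: str, answer: str) -> str:
    # Stand-in for the question generator: in the paper, a lightweight
    # Flan-T5 model reconstructs a question from the reasoning path and
    # answer. Here we echo a template so the sketch runs end to end.
    return f"What question is answered by: {reasoning_path} -> {answer}?"


def similarity(a: str, b: str) -> float:
    # Crude similarity proxy in [0, 1]; the paper scores reconstructed
    # vs. original questions with a semantic similarity measure.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def ccqa_select(original_question: str, candidates: list[tuple[str, str]]) -> str:
    """Pick the answer whose reconstructed question best matches the
    original question (cycle consistency).

    candidates: list of (reasoning_path, answer) pairs sampled from the SLM.
    """
    best_answer, best_score = candidates[0][1], -1.0
    for path, answer in candidates:
        reconstructed = generate_question(path, answer)
        score = similarity(reconstructed, original_question)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer
```

No model is retrained here, which mirrors the training-free character of the method: only sampling, question generation, and a similarity comparison happen at inference time.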