Asking the Right Questions: Improving Reasoning with Generated Stepping Stones

📅 2026-02-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limited performance of large language models (LLMs) on complex reasoning tasks due to the absence of effective intermediate guidance. To this end, the authors propose the ARQ framework, which constitutes the first systematic investigation into the efficacy and transferability of “stepping-stone” problems—such as simplified, reformulated, or sub-problems—and formalizes their generation as a post-training task. By leveraging synthetically generated data combined with supervised fine-tuning (SFT) and reinforcement learning (RL), the model is trained to autonomously produce high-quality intermediate questions. Experimental results demonstrate that this approach significantly improves the success rate of LLMs across varying capability levels on challenging reasoning benchmarks, including mathematical problem solving and programming tasks.

📝 Abstract
Recent years have witnessed tremendous progress in enabling LLMs to solve complex reasoning tasks such as math and coding. As we start to apply LLMs to harder tasks that they may not be able to solve in one shot, it is worth paying attention to their ability to construct intermediate stepping stones that prepare them to better solve the tasks. Examples of stepping stones include simplifications, alternative framings, or subproblems. We study the properties and benefits of stepping stones in the context of modern reasoning LLMs via ARQ (Asking the Right Questions), our simple framework which introduces a question generator to the default reasoning pipeline. We first show that good stepping stone questions exist and are transferable, meaning that good questions can be generated, and they substantially help LLMs of various capabilities in solving the target tasks. We next frame stepping stone generation as a post-training task and show that we can fine-tune LLMs to generate more useful stepping stones by SFT and RL on synthetic data.
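The pipeline the abstract describes, a question generator inserted ahead of the default solver, can be sketched as below. This is a minimal illustration, not the paper's implementation: `call_llm`, `generate_stepping_stone`, and `solve_with_stepping_stone` are hypothetical names, and the model call is stubbed out.

```python
# Sketch of an ARQ-style pipeline: a generator proposes a stepping-stone
# question (a simplification, reframing, or subproblem), the model answers
# it, and that intermediate answer is placed in context for the target task.
# All names here are illustrative assumptions, not the paper's API.

def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call."""
    return f"[model response to: {prompt[:50]}...]"

def generate_stepping_stone(target_task: str) -> str:
    # The question-generator step: ask for one helpful intermediate question.
    return call_llm(
        "Propose one simpler sub-question whose answer would help solve:\n"
        + target_task
    )

def solve_with_stepping_stone(target_task: str) -> str:
    stepping_stone = generate_stepping_stone(target_task)
    intermediate_answer = call_llm(stepping_stone)
    # Solve the original task with the stepping stone and its answer in context.
    return call_llm(
        f"Sub-question: {stepping_stone}\n"
        f"Sub-answer: {intermediate_answer}\n"
        f"Now solve the original task: {target_task}"
    )

print(solve_with_stepping_stone("Prove that the sum of two odd integers is even."))
```

In the paper's framing, the generator itself is what gets post-trained (SFT then RL on synthetic data) so that the stepping-stone questions it emits become more useful to solvers of varying capability.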
Problem

Research questions and friction points this paper is trying to address.

reasoning
stepping stones
large language models
question generation
complex tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

stepping stones
question generation
reasoning enhancement
supervised fine-tuning
reinforcement learning