MindLoom: Composing Thought Modes for Frontier-Level Reasoning Data Synthesis

Date: 2026-05-20
AI Summary
Existing approaches struggle to systematically generate high-quality, challenging, and diverse reasoning data, primarily because they lack an interpretable, structured model of problem difficulty. This work introduces the concept of "thought modes," modeling reasoning difficulty as a composition of atomic knowledge and inference transformations. It presents a compositional engineering framework that extracts chains of thought modes by decomposing solutions to hard problems, then combines retrieval-augmented synthesis, distribution-aligned sampling, and a rollout-based difficulty discriminator to enable controllable generation of frontier-level reasoning data. Evaluated on nine benchmarks spanning five STEM domains and four mathematical task types, models fine-tuned on the generated MindLoom data consistently outperform baselines, demonstrating the method's effectiveness in improving data diversity, difficulty control, and model generalization.
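The rollout-based difficulty discriminator mentioned above can be pictured with a minimal sketch. This is not the authors' implementation: the names `judge_by_rollouts`, `JudgedQuestion`, `toy_rollout`, and `toy_is_correct`, the number of rollouts, and the pass-rate thresholds are all assumptions made for illustration. The idea shown is only that a question is attempted several times, the fraction of correct attempts serves as a difficulty signal, and the correct attempts are kept as candidate fine-tuning responses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class JudgedQuestion:
    question: str
    pass_rate: float
    difficulty: str
    correct_responses: List[str]


def judge_by_rollouts(
    question: str,
    rollout: Callable[[str, int], str],      # hypothetical solver: (question, attempt index) -> response
    is_correct: Callable[[str, str], bool],  # hypothetical judge: (question, response) -> verdict
    num_rollouts: int = 8,                   # assumed value, not taken from the paper
    thresholds: Tuple[float, float] = (0.25, 0.75),  # assumed cut-offs
) -> JudgedQuestion:
    """Label a question by how often independent rollouts solve it."""
    responses = [rollout(question, i) for i in range(num_rollouts)]
    correct = [r for r in responses if is_correct(question, r)]
    pass_rate = len(correct) / num_rollouts
    low, high = thresholds
    if pass_rate < low:
        difficulty = "hard"
    elif pass_rate < high:
        difficulty = "medium"
    else:
        difficulty = "easy"
    return JudgedQuestion(question, pass_rate, difficulty, correct)


# Toy stand-ins for a model and a judge: every other attempt answers "4".
def toy_rollout(question: str, attempt: int) -> str:
    return "4" if attempt % 2 == 0 else "5"


def toy_is_correct(question: str, response: str) -> bool:
    return response == "4"


judged = judge_by_rollouts("What is 2 + 2?", toy_rollout, toy_is_correct)
print(judged.difficulty, judged.pass_rate, len(judged.correct_responses))
```

In a real pipeline the two callables would wrap a language model and an answer verifier; the toy versions here only make the control flow visible.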
Abstract
Although LLMs have made substantial progress in reasoning, systematically producing frontier-level reasoning data remains difficult. Existing synthesis methods often have limited visibility into the structural factors that govern problem difficulty, which can result in narrow diversity and unstable difficulty control. In this work, we view the difficulty of a reasoning problem as arising from the accumulation of atomic knowledge-reasoning transformations, which we term thought modes. Building on this perspective, we propose MindLoom, a framework for synthesizing frontier-level reasoning data through compositional thought mode engineering. Given a collection of hard problems with verified solutions, MindLoom first decomposes those solutions into thought mode chains that reveal each problem's construction logic. It then trains a retrieval model that matches problem states to compatible thought modes, providing guidance on which reasoning challenges to introduce during synthesis. New problems are composed by iteratively applying retrieved thought modes to seed questions, with distribution-aligned sampling to encourage diverse reasoning coverage. Finally, a rollout-based judging stage labels generated questions by difficulty and supplies judged-correct responses for supervised fine-tuning. We evaluate MindLoom on nine benchmarks covering five STEM disciplines and four mathematical reasoning tasks across multiple model families and sizes. Models fine-tuned on MindLoom-generated data achieve favorable performance relative to base models, distillation, and external-data baselines across the reported benchmarks. Ablation studies indicate the contribution of each component, and further analysis suggests that MindLoom covers a broad range of reasoning patterns while maintaining useful difficulty control. We have open-sourced our implementation at https://github.com/EachSheep/MindLoom.
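The composition step with distribution-aligned sampling can be illustrated with a small sketch. The abstract does not specify the sampling rule, so the one below is an assumption: each candidate thought mode is weighted by how far its usage so far falls short of a target share, which nudges the composed chains toward the target distribution. The function names, the mode labels, and the small weight floor are hypothetical, and the retrieval model is replaced by a fixed candidate list.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def alignment_weights(
    candidates: Sequence[str],
    target: Dict[str, float],  # assumed target share per thought mode
    used: Counter,             # how often each mode has been applied so far
) -> List[float]:
    """Give larger weights to modes that are under-represented relative to the target."""
    total = sum(used.values())
    weights = []
    for mode in candidates:
        empirical = used[mode] / total if total else 0.0
        deficit = max(target.get(mode, 0.0) - empirical, 0.0)
        weights.append(deficit + 1e-3)  # small floor keeps every candidate reachable
    return weights


def compose_chain(
    candidates: Sequence[str],
    target: Dict[str, float],
    used: Counter,
    steps: int,
    rng: random.Random,
) -> List[str]:
    """Pick one thought mode per step, updating usage counts as the chain grows."""
    chain = []
    for _ in range(steps):
        weights = alignment_weights(candidates, target, used)
        mode = rng.choices(list(candidates), weights=weights, k=1)[0]
        chain.append(mode)
        used[mode] += 1
    return chain


# Hypothetical mode labels; a real system would obtain candidates from the retrieval model.
modes = ["add_constraint", "change_representation", "chain_lemma"]
target_share = {"add_constraint": 0.4, "change_representation": 0.3, "chain_lemma": 0.3}
usage = Counter({"add_constraint": 10})  # one mode is already over-used
chain = compose_chain(modes, target_share, usage, steps=4, rng=random.Random(0))
print(chain)
```

Each selected mode would then be applied to the current problem state by a generator model; that rewriting step is left out because the abstract gives no detail on it.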
Problem

Research questions and friction points this paper is trying to address.

reasoning data synthesis
problem difficulty
thought modes
frontier-level reasoning
diversity control
Innovation

Methods, ideas, or system contributions that make the work stand out.

thought modes
reasoning data synthesis
compositional reasoning
difficulty control
MindLoom