π€ AI Summary
Large language models (LLMs) exhibit limited reasoning capabilities and low efficiency when tackling long-horizon, complex decision-making tasks.
Method: We propose a self-evolving curriculum learning framework that constructs progressive problem sequences. It dynamically assesses model performance in real time and adaptively adjusts problem difficulty to co-evolve personalized learning paths and behavioral code. The framework integrates LLM-driven curriculum generation, decision-tree-based script output, and feedback-guided difficulty modulation.
Contribution/Results: On challenging long-horizon decision benchmarks, our method significantly improves task success rate (+28.6%) and inference efficiency (37.4% reduction in reasoning steps). It is the first work to systematically introduce curriculum learning into structured, long-horizon reasoning for LLMs, empirically validating the effectiveness and scalability of difficulty-adaptive evolution for enhancing LLMsβ deep decision-making capabilities.
π Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse domains, including programming, planning, and decision-making. However, their performance often degrades when faced with highly complex problem instances that require deep reasoning over long horizons. In such cases, direct problem-solving approaches can lead to inefficiency or failure due to the lack of structured intermediate guidance. To address this, we propose a novel self-evolve framework, EvoCurr, in which a dedicated curriculum-generation LLM constructs a sequence of problem instances with gradually increasing difficulty, tailored to the solver LLM's learning progress. The curriculum dynamically adapts easing challenges when the solver struggles and escalating them when success is consistent, thus maintaining an optimal learning trajectory. This approach enables the solver LLM, implemented as a code-generation model producing Python decision-tree scripts, to progressively acquire the skills needed for complex decision-making tasks. Experimental results on challenging decision-making benchmarks show that our method significantly improves task success rates and solution efficiency compared to direct-solving baselines. These findings suggest that LLM-driven curriculum learning holds strong potential for enhancing automated reasoning in real-world, high-complexity domains.