DSMentor: Enhancing Data Science Agents with Curriculum Learning and Online Knowledge Accumulation

📅 2025-05-20
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the low reasoning efficiency and weak knowledge reuse of LLM-based agents in data science tasks, this paper proposes an optimization framework that integrates inference-time curriculum learning with online knowledge accumulation. Methodologically: (1) it introduces a difficulty-ordered task sequencing strategy to enable progressive curriculum learning during inference; (2) it designs a dynamically expandable long-term memory module to support continuous experience accumulation and retrieval; and (3) it incorporates an agent-level reasoning scheduling mechanism tailored to the DSEval and QRData benchmarks. The key innovation is applying curriculum learning directly at the LLM inference stage, tightly coupled with online knowledge growth. Experiments demonstrate up to a 5.2% absolute improvement in pass rates on DSEval and QRData; notably, on causal reasoning tasks, the method outperforms GPT-4 with Program-of-Thoughts (PoT) prompting by 8.8%.
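The summary above describes three moving parts: difficulty-ordered task sequencing, a growing long-term memory, and retrieval of prior experiences at solve time. A minimal sketch of how such an inference loop could fit together is given below; all function and field names (`estimate_difficulty`, `solve_with_memory`, the task dictionaries) are illustrative assumptions, not the authors' actual implementation, and the difficulty score is a trivial stand-in for the paper's mentor-assigned difficulty.

```python
def estimate_difficulty(task):
    # Assumption: a mentor model would score task difficulty; here we use
    # a trivial proxy (description length) just to make the loop runnable.
    return len(task["description"])

def solve_with_memory(task, memory):
    # Placeholder for the LLM agent call; in the real framework, retrieved
    # experiences from easier, already-attempted tasks would be injected
    # into the agent's prompt.
    relevant = [m for m in memory if m["solved"]]
    return {"task": task["id"], "solved": True, "context_used": len(relevant)}

def dsmentor_inference(tasks):
    # 1. Order tasks from easy to hard (the inference-time curriculum).
    curriculum = sorted(tasks, key=estimate_difficulty)
    memory = []  # 2. Long-term memory that grows online, task by task.
    results = []
    for task in curriculum:
        # 3. Solve each task with access to accumulated experience.
        outcome = solve_with_memory(task, memory)
        memory.append({**outcome, "description": task["description"]})
        results.append(outcome)
    return results

tasks = [
    {"id": "hard", "description": "estimate a causal effect under confounding"},
    {"id": "easy", "description": "compute a column mean"},
]
print([r["task"] for r in dsmentor_inference(tasks)])  # easier task runs first
```

The essential design choice this sketch captures is that memory is written *during* inference, so later (harder) tasks can condition on experience from earlier (easier) ones, mirroring the mentor-guided progression described above.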

πŸ“ Abstract
Large language model (LLM) agents have shown promising performance in generating code for solving complex data science problems. Recent studies primarily focus on enhancing in-context learning through improved search, sampling, and planning techniques, while overlooking the importance of the order in which problems are tackled during inference. In this work, we develop a novel inference-time optimization framework, referred to as DSMentor, which leverages curriculum learning -- a strategy that introduces simpler tasks first and progressively moves to more complex ones as the learner improves -- to enhance LLM agent performance in challenging data science tasks. Our mentor-guided framework organizes data science tasks in order of increasing difficulty and incorporates a growing long-term memory to retain prior experiences, guiding the agent's learning progression and enabling more effective utilization of accumulated knowledge. We evaluate DSMentor through extensive experiments on DSEval and QRData benchmarks. Experiments show that DSMentor using Claude-3.5-Sonnet improves the pass rate by up to 5.2% on DSEval and QRData compared to baseline agents. Furthermore, DSMentor demonstrates stronger causal reasoning ability, improving the pass rate by 8.8% on the causality problems compared to GPT-4 using Program-of-Thoughts prompts. Our work underscores the importance of developing effective strategies for accumulating and utilizing knowledge during inference, mirroring the human learning process and opening new avenues for improving LLM performance through curriculum-based inference optimization.
Problem

Research questions and friction points this paper is trying to address.

Optimizing LLM agent performance in data science tasks
Implementing curriculum learning for progressive difficulty handling
Enhancing knowledge accumulation and causal reasoning abilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Curriculum learning for task difficulty progression
Online knowledge accumulation with long-term memory
Mentor-guided inference-time optimization framework