AI Summary
Existing reinforcement learning (RL)-based approaches for automatic data preprocessing pipeline construction suffer from inefficient exploration in vast search spaces. To address this, we propose LLM-Augmented RL: a framework that integrates a large language model (LLM) as a semantic policy advisor to guide operator selection, incorporates experience distillation to reuse historically successful strategies, and employs an adaptive advisor-triggering mechanism to dynamically regulate LLM invocation timing. Experiments across 18 cross-domain datasets demonstrate that our method achieves up to a 22.4% improvement in pipeline quality over state-of-the-art baselines, accelerates convergence by 2.3×, and reduces LLM calls to only 19.0% of total exploration steps, striking a strong balance between performance gains and computational efficiency.
Abstract
Automated data preparation is crucial for democratizing machine learning, yet existing reinforcement learning (RL) based approaches suffer from inefficient exploration in the vast space of possible preprocessing pipelines. We present LLaPipe, a novel framework that addresses this exploration bottleneck by integrating Large Language Models (LLMs) as intelligent policy advisors. Unlike traditional methods that rely solely on statistical features and blind trial-and-error, LLaPipe leverages the semantic understanding capabilities of LLMs to provide contextually relevant exploration guidance. Our framework introduces three key innovations: (1) an LLM Policy Advisor that analyzes dataset semantics and pipeline history to suggest promising preprocessing operations, (2) an Experience Distillation mechanism that mines successful patterns from past pipelines and transfers this knowledge to guide future exploration, and (3) an Adaptive Advisor Triggering strategy (Advisor+) that dynamically determines when LLM intervention is most beneficial, balancing exploration effectiveness with computational cost. Through extensive experiments on 18 diverse datasets spanning multiple domains, we demonstrate that LLaPipe achieves up to 22.4% improvement in pipeline quality and 2.3× faster convergence compared to state-of-the-art RL-based methods, while maintaining computational efficiency through selective LLM usage (averaging only 19.0% of total exploration steps).
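To make the Adaptive Advisor Triggering idea concrete, the sketch below shows one plausible reading of it: the LLM advisor is consulted only when the RL agent's own value estimates are ambiguous, so most exploration steps remain cheap. The function names, the Q-value-gap uncertainty measure, and `llm_suggest_operator` are illustrative assumptions for this sketch, not LLaPipe's actual API.

```python
# Hedged sketch of adaptive advisor triggering: invoke the (mock) LLM
# advisor only when the agent's Q-values leave the choice of preprocessing
# operator ambiguous. All names here are hypothetical, not from the paper.

def policy_uncertainty(q_values):
    """Gap between the best and second-best Q-value; a small gap = uncertain."""
    ranked = sorted(q_values.values(), reverse=True)
    return ranked[0] - ranked[1]

def llm_suggest_operator(dataset_summary):
    # Stand-in for an actual LLM call; a fixed heuristic keeps the sketch runnable.
    return "impute_mean" if "missing" in dataset_summary else "standard_scale"

def choose_operator(q_values, dataset_summary, gap_threshold=0.05):
    """Return (operator, advisor_used). The advisor fires only on small Q-gaps."""
    if policy_uncertainty(q_values) < gap_threshold:
        return llm_suggest_operator(dataset_summary), True   # advisor step
    best = max(q_values, key=q_values.get)
    return best, False                                        # pure RL step
```

Under this scheme, advisor calls naturally become rarer as the policy's value estimates separate, which is consistent with the abstract's report that the LLM is invoked on only a minority of exploration steps.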