🤖 AI Summary
In time-series forecasting, manual feature engineering is labor-intensive, while existing automated approaches often rely on computationally expensive exhaustive search and lack domain awareness. To address these limitations, we propose the first framework to integrate large language models (LLMs) into evolutionary feature generation. Specifically, the LLM generates semantically meaningful, context-aware candidate transformations grounded in time-series statistics and feature importance; an evolutionary algorithm then efficiently searches and prunes this space, substantially reducing redundant computation. Our method eliminates reliance on hand-crafted rules and brute-force enumeration. Evaluated across diverse benchmarks, it improves forecasting accuracy by an average of 8.4%, demonstrating that language-guided feature engineering delivers joint gains in interpretability, computational efficiency, and predictive performance.
📝 Abstract
Time-series prediction involves forecasting future values using machine learning models. Feature engineering, whereby existing features are transformed to make new ones, is critical for enhancing model performance, but is often manual and time-intensive. Existing automation attempts rely on exhaustive enumeration, which can be computationally costly and lacks domain-specific insights. We introduce ELATE (Evolutionary Language model for Automated Time-series Engineering), which leverages a language model within an evolutionary framework to automate feature engineering for time-series data. ELATE employs time-series statistical measures and feature importance metrics to guide and prune features, while the language model proposes new, contextually relevant feature transformations. Our experiments demonstrate that ELATE improves forecasting accuracy by an average of 8.4% across various domains.
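The loop the abstract describes — propose candidate transformations, score them by feature importance, prune, repeat — can be sketched as follows. This is a minimal illustration, not ELATE's actual implementation: the candidate pool here is a fixed dictionary standing in for LLM proposals, and "importance" is approximated by absolute correlation with the one-step-ahead target; all names (`CANDIDATE_TRANSFORMS`, `evolve`, etc.) are hypothetical.

```python
import random

def make_series(n=200, seed=0):
    # Toy series: noisy linear trend.
    rng = random.Random(seed)
    return [0.05 * t + rng.gauss(0, 0.3) for t in range(n)]

# Stand-in for the LLM proposal step (hypothetical): in ELATE, a language
# model prompted with series statistics would suggest these transformations.
CANDIDATE_TRANSFORMS = {
    "lag1":   lambda s: s[:-1],
    "diff1":  lambda s: [b - a for a, b in zip(s, s[1:])],
    "roll3":  lambda s: [sum(s[i - 2:i + 1]) / 3 for i in range(2, len(s))],
    "square": lambda s: [x * x for x in s[:-1]],
}

def fitness(feature, target):
    # Proxy for feature importance: absolute Pearson correlation
    # between the candidate feature and the one-step-ahead target.
    n = min(len(feature), len(target))
    f, t = feature[-n:], target[-n:]
    mf, mt = sum(f) / n, sum(t) / n
    cov = sum((a - mf) * (b - mt) for a, b in zip(f, t))
    vf = sum((a - mf) ** 2 for a in f) ** 0.5
    vt = sum((b - mt) ** 2 for b in t) ** 0.5
    return abs(cov / (vf * vt)) if vf and vt else 0.0

def evolve(series, generations=3, keep=2):
    target = series[1:]  # forecast the next value
    pool = dict(CANDIDATE_TRANSFORMS)
    for _ in range(generations):
        scored = sorted(pool.items(),
                        key=lambda kv: fitness(kv[1](series), target),
                        reverse=True)
        pool = dict(scored[:keep])  # prune low-importance candidates
        # In ELATE, the LLM would now propose new transforms conditioned
        # on the survivors; this sketch simply keeps the pruned pool.
    return list(pool)

best = evolve(make_series())
```

On the trend-dominated toy series, lag and smoothing features correlate strongly with the next value while the pure-noise difference feature does not, so `diff1` is pruned. The point of the sketch is the division of labor: a proposal step supplies candidates, and cheap importance scoring plus selection replaces exhaustive enumeration of the transformation space.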