ELATE: Evolutionary Language model for Automated Time-series Engineering

📅 2025-08-20
🤖 AI Summary
In time-series forecasting, manual feature engineering is labor-intensive, while existing automated approaches often rely on computationally expensive exhaustive enumeration and lack domain awareness. To address these limitations, ELATE integrates a language model into an evolutionary feature generation framework. The language model proposes contextually relevant candidate transformations, and time-series statistical measures and feature importance metrics guide the search and prune weak candidates, which cuts redundant computation. This reduces reliance on hand-crafted rules and brute-force enumeration. Across various domains, the method improves forecasting accuracy by an average of 8.4%. The results suggest that language-guided feature engineering can improve predictive performance while lowering search cost.

📝 Abstract
Time-series prediction involves forecasting future values using machine learning models. Feature engineering, whereby existing features are transformed to make new ones, is critical for enhancing model performance, but is often manual and time-intensive. Existing automation attempts rely on exhaustive enumeration, which can be computationally costly and lacks domain-specific insights. We introduce ELATE (Evolutionary Language model for Automated Time-series Engineering), which leverages a language model within an evolutionary framework to automate feature engineering for time-series data. ELATE employs time-series statistical measures and feature importance metrics to guide and prune features, while the language model proposes new, contextually relevant feature transformations. Our experiments demonstrate that ELATE improves forecasting accuracy by an average of 8.4% across various domains.
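
The propose-evaluate-prune cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The transformation pool `TRANSFORMS`, the `propose` function (a random stand-in for the language model), and the `score` function (absolute correlation with the next value, standing in for a feature importance metric) are all assumptions made for illustration.

```python
import random
from statistics import fmean

# Hypothetical transformation pool. In ELATE a language model proposes
# transformations; here a fixed dictionary stands in for it.
TRANSFORMS = {
    "lag1": lambda s: [s[0]] + s[:-1],
    "diff1": lambda s: [0.0] + [b - a for a, b in zip(s, s[1:])],
    "roll_mean3": lambda s: [fmean(s[max(0, i - 2): i + 1]) for i in range(len(s))],
}


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def score(feature, series):
    """Assumed importance proxy: |corr(feature[t], series[t+1])|."""
    return abs(pearson(feature[:-1], series[1:]))


def propose(population, rng):
    """Stand-in for the language model: apply a random transform to a random parent."""
    parent = rng.choice(sorted(population))
    t_name = rng.choice(sorted(TRANSFORMS))
    return f"{t_name}({parent})", TRANSFORMS[t_name](population[parent])


def evolve(series, generations=5, proposals=4, keep=3, seed=0):
    """Grow candidate features each generation, then keep only the top scorers."""
    rng = random.Random(seed)
    population = {"y": list(series)}
    for _ in range(generations):
        for _ in range(proposals):
            name, values = propose(population, rng)
            population[name] = values
        ranked = sorted(population, key=lambda n: score(population[n], series), reverse=True)
        population = {n: population[n] for n in ranked[:keep]}
    return population


if __name__ == "__main__":
    demo = [float(i % 7) + 0.1 * i for i in range(60)]
    for name in evolve(demo, seed=1):
        print(name)
```

Replacing `propose` with a call to a language model, and `score` with an importance value from a fitted forecasting model, would bring the sketch closer to the setup the abstract describes. The pruning step is what keeps the candidate set from growing without bound.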

Problem

Research questions and friction points this paper is trying to address.

Automating time-series feature engineering to avoid manual effort
Reducing computational costs of exhaustive enumeration methods
Incorporating domain-specific insights into feature transformation processes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolutionary framework automates time-series feature engineering
Language model proposes contextually relevant feature transformations
Statistical measures and importance metrics guide feature pruning
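
As a rough illustration of the last point, the sketch below prunes a feature set using one statistical measure (variance) and one redundancy check (pairwise correlation), with features visited in order of importance. The summary does not say which statistics or thresholds ELATE uses, so the function `prune`, its parameters `max_corr` and `min_var`, and the importance scores are all hypothetical.

```python
from statistics import fmean, pvariance


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def prune(features, importance, max_corr=0.95, min_var=1e-12):
    """Return the names of features to keep.

    features:   dict mapping feature name to a list of values
    importance: dict mapping feature name to a score (higher is better);
                assumed to come from a fitted forecasting model
    """
    kept = []
    # Visit the most important features first so they win redundancy ties.
    for name in sorted(features, key=lambda n: importance[n], reverse=True):
        column = features[name]
        if pvariance(column) <= min_var:
            continue  # near-constant feature carries no signal
        if any(abs(pearson(column, features[k])) > max_corr for k in kept):
            continue  # nearly duplicates a feature that is already kept
        kept.append(name)
    return kept


if __name__ == "__main__":
    feats = {
        "a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "a_scaled": [2.0, 4.0, 6.0, 8.0, 10.0],
        "const": [3.0] * 5,
        "b": [1.0, -1.0, 1.0, -1.0, 1.0],
    }
    scores = {"a": 0.9, "a_scaled": 0.5, "const": 0.8, "b": 0.3}
    print(prune(feats, scores))
```

Visiting features in descending importance means that when two features are nearly collinear, the more useful one survives.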