🤖 AI Summary
To address vanishing gradients and weak long-range dependency modeling in conventional LSTMs for stock trading, this paper integrates Extended Long Short-Term Memory (xLSTM) into a deep reinforcement learning (DRL) trading framework, proposing an end-to-end automated trading system based on xLSTM and Proximal Policy Optimization (PPO). The method uses an actor-critic architecture in which both networks incorporate xLSTM units to improve temporal representation learning, supported by domain-informed financial feature engineering to make decisions more robust. Evaluated on multi-stock datasets from leading technology firms, the proposed approach achieves a 12.7% higher cumulative return, a 23.4% improvement in Sharpe ratio, and an 18.9% reduction in maximum drawdown compared to LSTM-based baselines. These results indicate gains in risk-adjusted performance and strategy stability, and suggest that xLSTM is a stronger temporal backbone than LSTM for DRL-based algorithmic trading.
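For readers unfamiliar with the three headline metrics, the following is a minimal sketch of how cumulative return, maximum drawdown, and Sharpe ratio are commonly computed from a series of portfolio values. The function names, the 252-step annualization factor, the zero risk-free rate, and the sample `portfolio` series are illustrative assumptions, not the paper's evaluation code or data.

```python
import math
from statistics import mean, stdev


def cumulative_return(values):
    """Total return over the whole series, as a fraction (0.2 means +20%)."""
    return values[-1] / values[0] - 1.0


def max_drawdown(values):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # highest value seen so far
        worst = max(worst, (peak - v) / peak)  # decline from that peak
    return worst


def sharpe_ratio(values, risk_free_per_step=0.0, steps_per_year=252):
    """Annualized Sharpe ratio from per-step simple returns.

    Needs at least three values (two returns) for the sample standard
    deviation. Both default parameters are assumptions.
    """
    returns = [b / a - 1.0 for a, b in zip(values, values[1:])]
    excess = [r - risk_free_per_step for r in returns]
    sd = stdev(excess)
    if sd == 0.0:
        return float("nan")  # undefined when returns never vary
    return mean(excess) / sd * math.sqrt(steps_per_year)


# Hypothetical portfolio values, for illustration only.
portfolio = [100.0, 110.0, 99.0, 120.0]
print(cumulative_return(portfolio))  # about 0.2
print(max_drawdown(portfolio))       # about 0.1 (the drop from 110 to 99)
print(sharpe_ratio(portfolio))
```

A higher Sharpe ratio together with a lower maximum drawdown is what the summary means by better risk-adjusted performance: more return per unit of volatility, with shallower losses from any peak.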
📝 Abstract
Traditional Long Short-Term Memory (LSTM) networks are effective for handling sequential data but suffer from limitations such as vanishing gradients and difficulty in capturing long-term dependencies, which can hurt their performance in dynamic and risky environments like stock trading. To address these limitations, this study explores the use of the recently introduced Extended Long Short-Term Memory (xLSTM) network in combination with a deep reinforcement learning (DRL) approach for automated stock trading. Our proposed method uses xLSTM networks in both the actor and critic components, enabling effective handling of time series data and dynamic market environments. Proximal Policy Optimization (PPO), with its ability to balance exploration and exploitation, is employed to optimize the trading strategy. Experiments on financial data from major tech companies over a comprehensive timeline show that the xLSTM-based model outperforms LSTM-based methods on key trading evaluation metrics, including cumulative return, average profitability per trade, maximum earning rate, maximum pullback, and Sharpe ratio. These findings highlight the potential of xLSTM for enhancing DRL-based stock trading systems.
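The abstract does not say which PPO variant is used. As an illustration only, the sketch below assumes the widely used clipped-surrogate form, in which the policy update is limited by clipping the probability ratio between the new and old policies. The function name, the `clip_eps=0.2` default, and the sample numbers are assumptions, not details taken from the paper.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    ratios:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantages: advantage estimates for the same samples
    clip_eps:   clipping range; 0.2 is a common default, assumed here
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


# Hypothetical samples: one action made more likely with positive
# advantage, one made less likely with negative advantage.
objective = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
print(objective)  # about 0.2
```

In an actor-critic setup like the one described, the actor would be trained to increase this objective, while the critic supplies the value estimates from which the advantages are derived.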