Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy

📅 2025-11-15
🤖 AI Summary
To address the limited robustness and profitability of automated trading strategies in dynamic, volatile stock markets, this paper proposes an ensemble deep reinforcement learning (DRL) trading framework that combines Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). Built on a common actor-critic architecture, the framework models a continuous action space and uses a load-on-demand mechanism to limit memory overhead when training on large time series. Its key idea is a rolling-window ensemble scheme: the three agents are retrained periodically, and the one with the highest Sharpe ratio on the most recent validation window is deployed for the next trading period, which helps the strategy adapt across market regimes such as trending, mean-reverting, and high-volatility conditions. Empirical evaluation on the 30 Dow Jones Industrial Average (DJIA) constituent stocks shows that the ensemble achieves a higher Sharpe ratio than each individual DRL algorithm and than traditional baselines, namely the DJIA index and the minimum-variance portfolio, supporting its effectiveness and practical viability.
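A minimal sketch of this selection logic, assuming three already-trained agents and the daily return series each produced on the validation window; the `sharpe_ratio` and `pick_agent` helpers and the 252-day annualization constant are illustrative, not the paper's released code.

```python
import numpy as np

def sharpe_ratio(daily_returns, risk_free=0.0, periods=252):
    """Annualized Sharpe ratio of a daily return series."""
    excess = np.asarray(daily_returns, dtype=float) - risk_free / periods
    return np.sqrt(periods) * excess.mean() / (excess.std(ddof=1) + 1e-9)

def pick_agent(agents, validation_returns):
    """Pick the agent whose validation-window returns had the highest
    Sharpe ratio; the ensemble deploys it for the next trading period.

    agents: dict mapping name -> trained agent, e.g.
            {"ppo": ppo_agent, "a2c": a2c_agent, "ddpg": ddpg_agent}
    validation_returns: dict mapping name -> daily returns from running
            that agent on the most recent validation window
    """
    scores = {name: sharpe_ratio(r) for name, r in validation_returns.items()}
    best = max(scores, key=scores.get)
    return best, agents[best]
```

Selecting a single best agent per window (rather than averaging their actions) keeps each period's policy coherent while still letting the ensemble switch algorithms as market conditions change.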

📝 Abstract
Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market situations. In order to avoid the large memory consumption in training networks with continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks that have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy is shown to outperform the three individual algorithms and two baselines in terms of the risk-adjusted return measured by the Sharpe ratio. This work is fully open-sourced at GitHub: https://github.com/AI4Finance-Foundation/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020
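For reference, the risk-adjusted return used for evaluation above is the annualized Sharpe ratio; the 252 trading-day annualization factor is the standard convention, an assumption not stated in the abstract itself:

$$\text{Sharpe} = \sqrt{252}\,\frac{\bar{r} - r_f}{\sigma_r},$$

where $\bar{r}$ and $\sigma_r$ are the mean and standard deviation of the daily portfolio returns and $r_f$ is the daily risk-free rate.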
Problem

Research questions and friction points this paper is trying to address.

Designing profitable stock trading strategies in complex, dynamic markets
Building an ensemble of deep reinforcement learning agents for automated trading
Optimizing risk-adjusted investment return by combining multiple algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble strategy combines three actor-critic algorithms (PPO, A2C, DDPG)
Load-on-demand technique reduces memory consumption on large time-series data (see the sketch after this list)
Deep reinforcement learning agents trained to maximize investment return
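A minimal sketch of what a load-on-demand scheme can look like: only the rows for the current training or validation window are read from disk, instead of holding the full multi-year, multi-ticker history in memory. The per-ticker CSV layout and the `load_window` helper are hypothetical, not the released implementation.

```python
import pandas as pd

def load_window(path_template, tickers, start, end):
    """Load only the rows in [start, end) for each ticker, so the full
    price history never sits in memory at once.

    path_template: per-ticker file layout such as "data/{ticker}.csv"
                   (hypothetical; the released code may organize data
                   differently).
    """
    frames = []
    for ticker in tickers:
        df = pd.read_csv(path_template.format(ticker=ticker),
                         parse_dates=["date"])
        mask = (df["date"] >= start) & (df["date"] < end)
        frames.append(df.loc[mask].assign(ticker=ticker))
    return pd.concat(frames, ignore_index=True)

# Example: fetch one training window on demand.
# window = load_window("data/{ticker}.csv", ["AAPL", "MSFT"],
#                      "2016-01-01", "2016-04-01")
```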
👥 Authors
Hongyang Yang, AI4Finance Foundation
Xiao-Yang Liu, Columbia University (topics: Tensor, Deep Learning, Reinforcement Learning, Big Data)
Shan Zhong, Dept. of Electrical Engineering, Columbia University
Anwar Walid, Mathematics of Systems Research Department, Nokia-Bell Labs