Realistic Market Impact Modeling for Reinforcement Learning Trading Environments

📅 2026-03-30
🤖 AI Summary
Most existing reinforcement learning trading environments neglect or oversimplify transaction costs, so the strategies they produce often fail in real-world deployment. Building on the Almgren-Chriss (AC) framework and the square-root market impact law, this work introduces three open-source, Gymnasium-compatible trading environments (stock trading, margin trading, and portfolio optimization) that incorporate empirically validated nonlinear market impact models. The environments support pluggable cost models, permanent impact tracking with exponential decay, and trade-level logging. The suite is released as an extension to FinRL-Meta, and five deep reinforcement learning algorithms (A2C, PPO, DDPG, SAC, TD3) are evaluated on NASDAQ-100 data with Optuna-tuned hyperparameters. Relative to a fixed 10 bps baseline, the AC model changes trading behavior sharply: in one reported example, daily costs fall from $200,000 to $8,000 and turnover from 19% to 1%. Hyperparameter optimization reduces costs by up to 82%, and both absolute performance and the relative ranking of algorithms are highly sensitive to the cost model.
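To make the contrast between a fixed-rate cost and a nonlinear impact cost concrete, here is a minimal sketch. It uses the commonly cited form of the square-root law (fractional impact ≈ Y · σ · sqrt(Q / V)); the function names, the coefficient `y`, and the example inputs are illustrative assumptions, not the paper's implementation or calibration.

```python
import math


def fixed_bps_cost(notional: float, bps: float = 10.0) -> float:
    """Cost in dollars under a flat rate quoted in basis points."""
    return abs(notional) * bps / 1e4


def sqrt_impact_cost(
    shares: float,
    price: float,
    daily_volume: float,
    daily_vol: float,
    y: float = 1.0,  # hypothetical order-one coefficient, not calibrated
) -> float:
    """Cost in dollars under a square-root impact law (illustrative form).

    Fractional impact = y * daily_vol * sqrt(|shares| / daily_volume),
    applied to the traded notional.
    """
    if shares == 0:
        return 0.0
    participation = abs(shares) / daily_volume
    impact_fraction = y * daily_vol * math.sqrt(participation)
    return abs(shares) * price * impact_fraction


if __name__ == "__main__":
    # Hypothetical stock: $100 price, 1M shares/day, 2% daily volatility.
    for shares in (10_000, 40_000, 160_000):
        flat = fixed_bps_cost(shares * 100.0)
        impact = sqrt_impact_cost(shares, 100.0, 1_000_000, 0.02)
        print(f"{shares:>7} shares  fixed={flat:>10.2f}  sqrt-law={impact:>10.2f}")
```

Under this form the flat-rate cost grows linearly with order size, while the impact cost grows as size to the power 1.5, so an agent that trades large blocks is penalized far more heavily than under a fixed 10 bps assumption.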
📝 Abstract
Reinforcement learning (RL) has shown promise for trading, yet most open-source backtesting environments assume negligible or fixed transaction costs, causing agents to learn trading behaviors that fail under realistic execution. We introduce three Gymnasium-compatible trading environments -- MACE (Market-Adjusted Cost Execution) stock trading, margin trading, and portfolio optimization -- that integrate nonlinear market impact models grounded in the Almgren-Chriss (AC) framework and the empirically validated square-root impact law. Each environment provides pluggable cost models, permanent impact tracking with exponential decay, and comprehensive trade-level logging. We evaluate five deep RL (DRL) algorithms (A2C, PPO, DDPG, SAC, TD3) on the NASDAQ-100, comparing a fixed 10 bps baseline against the AC model with Optuna-tuned hyperparameters. Our results show that (i) the cost model materially changes both absolute performance and the relative ranking of algorithms across all three environments; (ii) the AC model produces dramatically different trading behavior, e.g., daily costs dropping from $200k to $8k with turnover falling from 19% to 1%; (iii) hyperparameter optimization is essential for constraining pathological trading, with costs dropping up to 82%; and (iv) algorithm-cost model interactions are strongly environment-specific, e.g., DDPG's out-of-sample (OOS) Sharpe ratio jumps from -2.1 to 0.3 under AC in margin trading while SAC's drops from -0.5 to -1.2. We release the full suite as an open-source extension to FinRL-Meta.
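The abstract mentions permanent impact tracking with exponential decay but does not give its functional form. The sketch below shows one plausible reading, assuming a linear permanent impact term (as in the Almgren-Chriss framework) that relaxes by a factor exp(-1 / decay_days) each trading day. The class name `DecayingImpactTracker` and the parameters `gamma` and `decay_days` are hypothetical and are not taken from the released code.

```python
import math
from dataclasses import dataclass


@dataclass
class DecayingImpactTracker:
    """Tracks an accumulated price shift that decays exponentially over time.

    All parameters are illustrative assumptions, not calibrated values.
    """

    gamma: float = 0.1       # hypothetical linear impact coefficient
    decay_days: float = 5.0  # hypothetical e-folding time in trading days
    level: float = 0.0       # current price shift as a fraction of price

    def step(self, signed_shares: float, daily_volume: float) -> float:
        """Advance one day: decay the existing shift, then add today's trade."""
        self.level *= math.exp(-1.0 / self.decay_days)
        self.level += self.gamma * signed_shares / daily_volume
        return self.level


if __name__ == "__main__":
    tracker = DecayingImpactTracker()
    # Buy 10,000 shares on day 0, then stay flat and watch the shift relax.
    print(tracker.step(10_000, 1_000_000))
    for _ in range(3):
        print(tracker.step(0, 1_000_000))
```

In an environment built this way, the tracked shift would be added to the execution price on later steps, so repeated buying in the same direction gets progressively more expensive until the shift decays.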
Problem

Research questions and friction points this paper is trying to address.

market impact
reinforcement learning
trading environment
transaction costs
backtesting
Innovation

Methods, ideas, or system contributions that make the work stand out.

market impact modeling
reinforcement learning trading
Almgren-Chriss framework
nonlinear transaction costs
Gymnasium-compatible environment