Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution

📅 2025-11-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the optimal execution of large orders, aiming to minimize both implementation shortfall and market impact. We propose a data-driven, nonparametric reinforcement learning framework: first, we construct a high-fidelity limit-order-book market simulator based on a queue-reactive model that captures transient price impact and dynamic order-flow responses; second, we integrate this model with a model-free RL algorithm—specifically, Double DQN—using state features including time, inventory position, asset price, and order-book depth. Unlike conventional parametric approaches, our framework imposes no assumptions on market dynamics, enabling counterfactual policy evaluation and adaptive strategy generation. Empirical results demonstrate that the learned policy exhibits both strategic temporal planning and tactical micro-adjustments, consistently outperforming benchmark methods across diverse market scenarios.
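The summary's key algorithmic ingredient, Double DQN, decouples action selection from action evaluation: the online network picks the greedy action in the next state, while the target network values it. A minimal sketch of that target computation, with linear stand-ins for the two networks and illustrative feature values (the feature encoding, action count, and rate constants here are assumptions, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical state: (time remaining, inventory, mid-price, best-queue depth)
N_FEATURES, N_ACTIONS = 4, 5   # actions could be child-order sizes (illustrative)
GAMMA = 0.99

# Stand-ins for the online and target networks: linear Q-functions
W_online = rng.normal(size=(N_FEATURES, N_ACTIONS))
W_target = W_online.copy()

def q_values(W, state):
    return state @ W

def double_dqn_target(reward, next_state, done):
    """Double DQN: the action is chosen by the online net, valued by the target net."""
    if done:
        return reward
    best_action = int(np.argmax(q_values(W_online, next_state)))
    return reward + GAMMA * q_values(W_target, next_state)[best_action]

state = np.array([1.0, 100.0, 50.0, 10.0])   # illustrative feature values
target = double_dqn_target(reward=-0.5, next_state=state, done=False)
```

In training, `target` would serve as the regression label for the online network's Q-value at the taken action, with `W_target` refreshed periodically from `W_online`.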

📝 Abstract
We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to incrementally execute large orders while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that capture transient price impact as well as nonlinear and dynamic order flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the agent learns a policy that is both strategic and tactical, adapting effectively to order book conditions and outperforming standard approaches across multiple training configurations. These findings provide strong evidence that model-free Reinforcement Learning can yield adaptive and robust solutions to the optimal execution problem.
Problem

Research questions and friction points this paper is trying to address.

Optimizing large order execution while minimizing market impact and implementation shortfall
Developing model-free reinforcement learning for realistic limit order book simulations
Training adaptive trading agents to outperform traditional execution benchmarks
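The first friction point centers on implementation shortfall: the gap between the "paper" value of the order at the decision price and the value actually realized by the child-order fills. A minimal, illustrative per-share definition (the function name and fill format are hypothetical, not from the paper):

```python
# Implementation shortfall for a meta-order (illustrative definition):
# average execution price versus the decision price, signed by trade side.
def implementation_shortfall(decision_price, fills, side="buy"):
    """fills: list of (price, quantity) child-order executions."""
    qty = sum(q for _, q in fills)
    cost = sum(p * q for p, q in fills)
    avg = cost / qty
    sign = 1 if side == "buy" else -1
    return sign * (avg - decision_price)   # per-share shortfall

# Buying 100 shares at an average of 100.2 against a decision price of 100.0
# gives a shortfall of about 0.2 per share.
shortfall = implementation_shortfall(100.0, [(100.1, 50), (100.3, 50)])
```

Market impact enters through the fill prices themselves: the more aggressively the agent trades, the further `avg` drifts from `decision_price`, which is the tension the RL policy must balance.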
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-free reinforcement learning for optimal execution
Queue-Reactive Model generates realistic market simulations
Double Deep Q-Network trained on multi-dimensional state space
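The simulator contribution rests on the queue-reactive idea that order-flow intensities depend on the current queue size. A toy single-queue sketch using competing exponential clocks (all rate functions below are hypothetical placeholders, not the paper's calibrated intensities):

```python
import random

random.seed(0)

# Illustrative queue-reactive step: event intensities depend on the current
# queue size q. These rate functions are made up for illustration only.
def intensities(q):
    limit  = 1.0 / (1.0 + 0.1 * q)      # fewer insertions when the queue is long
    cancel = 0.2 * q                     # cancellations scale with queue size
    market = 0.5                         # market orders arrive at a flat rate
    return {"limit": limit, "cancel": cancel, "market": market}

def step(q):
    """Draw the next event via competing exponential clocks and update q."""
    rates = intensities(q)
    total = sum(rates.values())
    wait = random.expovariate(total)     # time to the next event
    u, acc = random.random() * total, 0.0
    for event, rate in rates.items():
        acc += rate
        if u <= acc:
            break
    if event == "limit":
        q += 1
    elif q > 0:          # a cancellation or market order consumes one unit
        q -= 1
    return q, event, wait

q = 5
for _ in range(3):
    q, event, wait = step(q)
```

Because the rates are re-evaluated at every event, the simulated flow reacts to the book's state, which is what gives the RL agent counterfactual feedback that a historical replay cannot provide.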
Tomas Espana
ORFE, Princeton University, Princeton, NJ, USA
Yadh Hafsi
CMAP, École Polytechnique, Palaiseau, France
Fabrizio Lillo
Università di Bologna and Scuola Normale Superiore, Pisa
Quantitative Finance · Statistical Mechanics · Data Science
Edoardo Vittori
Intesa Sanpaolo, Milan, Italy