🤖 AI Summary
This paper addresses the optimal execution of large orders, aiming to minimize both implementation shortfall and market impact. The authors propose a data-driven, nonparametric reinforcement learning framework: first, they construct a high-fidelity limit-order-book simulator based on a queue-reactive model that captures transient price impact and dynamic order-flow responses; second, they couple this simulator with a model-free RL algorithm (specifically, Double DQN) whose state features include time, inventory position, asset price, and order-book depth. Unlike conventional parametric approaches, the framework imposes no assumptions on market dynamics, enabling counterfactual policy evaluation and adaptive strategy generation. Empirical results demonstrate that the learned policy exhibits both strategic temporal planning and tactical micro-adjustments, consistently outperforming benchmark methods across diverse market scenarios.
📝 Abstract
We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to execute large orders incrementally while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that capture transient price impact as well as nonlinear, dynamic order-flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the agent learns a policy that is both strategic and tactical, adapting effectively to order book conditions and outperforming standard approaches across multiple training configurations. These findings provide strong evidence that model-free Reinforcement Learning can yield adaptive and robust solutions to the optimal execution problem.
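To make the methodology concrete, the core of the Double DQN update described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear Q-functions, the action discretization, and all dimensions below are stand-in assumptions, with the state vector following the (time, inventory, price, depth) features named above.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 4   # (time remaining, inventory, price, order-book depth)
N_ACTIONS = 5   # hypothetical discretization of child-order sizes
GAMMA = 1.0     # episodic execution horizon; no discounting assumed

# Linear Q approximators standing in for the online and target networks.
W_online = rng.normal(size=(STATE_DIM, N_ACTIONS))
W_target = rng.normal(size=(STATE_DIM, N_ACTIONS))

def q_values(W, state):
    """Q(s, .) for every action under the given (stand-in) network."""
    return state @ W

def double_dqn_target(next_state, reward, done):
    """Double DQN regression target for one transition."""
    # Select the greedy action with the ONLINE network...
    a_star = int(np.argmax(q_values(W_online, next_state)))
    # ...but evaluate it with the TARGET network, which is what
    # distinguishes Double DQN from vanilla DQN and curbs the
    # overestimation bias of the max operator.
    bootstrap = 0.0 if done else q_values(W_target, next_state)[a_star]
    return reward + GAMMA * bootstrap

# One simulated transition: negative reward as implementation shortfall.
s_next = rng.normal(size=STATE_DIM)
y = double_dqn_target(s_next, reward=-0.3, done=False)
```

In training, `y` would serve as the regression target for the online network's Q-value at the visited state-action pair, with the target weights periodically synchronized from the online weights.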