Reinforcement Learning for Jump-Diffusions

📅 2024-05-26
🏛️ arXiv.org
📈 Citations: 2
Influential: 0

🤖 AI Summary
This paper studies continuous-time stochastic control when the system dynamics follow a jump-diffusion process, which captures sudden, discontinuous moves as well as diffusive noise, and formulates an entropy-regularized exploratory reinforcement learning (RL) framework with stochastic policies for this setting. Extending the pure-diffusion formulation of Wang et al. (2020), it derives the exploratory dynamics with a careful treatment of the jump part. The theoretical analysis shows that the policy evaluation and q-learning algorithms of Jia and Zhou (2022a, 2023), developed for controlled diffusions, can be used unchanged, without checking in advance whether the data come from a pure diffusion or a jump-diffusion; in general, however, the presence of jumps affects how actors and critics should be parameterized. Key contributions are: (1) an entropy-regularized exploratory control formulation for jump-diffusion systems; (2) an application to mean-variance portfolio selection with a jump-diffusion stock price, where both the RL algorithms and the parameterizations turn out to be invariant with respect to jumps; and (3) a detailed study of applying the general theory to option hedging.

📝 Abstract
We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exploration-exploitation balance essential for RL. Unlike the pure diffusion case initially studied by Wang et al. (2020), the derivation of the exploratory dynamics under jump-diffusions calls for a careful formulation of the jump part. Through a theoretical analysis, we find that one can simply use the same policy evaluation and $q$-learning algorithms in Jia and Zhou (2022a, 2023), originally developed for controlled diffusions, without needing to check a priori whether the underlying data come from a pure diffusion or a jump-diffusion. However, we show that the presence of jumps ought to affect parameterizations of actors and critics in general. We investigate as an application the mean-variance portfolio selection problem with stock price modelled as a jump-diffusion, and show that both RL algorithms and parameterizations are invariant with respect to jumps. Finally, we present a detailed study on applying the general theory to option hedging.
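
To make the phrase "stock price modelled as a jump-diffusion" concrete, the following is a minimal simulation sketch, not the authors' model or code. It assumes a Merton-style log-price: Brownian motion with drift plus compound Poisson jumps with normally distributed sizes. The function name `simulate_jump_diffusion` and every parameter value are illustrative assumptions that do not appear in the paper.

```python
import numpy as np

def simulate_jump_diffusion(s0, mu, sigma, lam, jump_mean, jump_std, T, n_steps, rng):
    """Simulate one price path on a uniform time grid.

    Assumed model (illustrative, not taken from the paper): the log-price
    moves by a Gaussian diffusion increment plus the sum of the jumps that
    arrive in each step. Jump arrivals are Poisson with intensity `lam`,
    and jump sizes in the log-price are Normal(jump_mean, jump_std).
    """
    dt = T / n_steps
    log_s = np.empty(n_steps + 1)
    log_s[0] = np.log(s0)
    for k in range(n_steps):
        # Continuous (diffusive) part of the increment.
        diffusion = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        # Discontinuous part: number of jumps in this step, then their total size.
        n_jumps = rng.poisson(lam * dt)
        jumps = rng.normal(jump_mean, jump_std, size=n_jumps).sum()
        log_s[k + 1] = log_s[k] + diffusion + jumps
    return np.exp(log_s)

rng = np.random.default_rng(0)
# Hypothetical parameters: about one jump per year, averaging -5% in log terms.
path = simulate_jump_diffusion(100.0, 0.05, 0.2, 1.0, -0.05, 0.1, 1.0, 252, rng)
```

Setting `lam=0` recovers a pure diffusion, so the same routine can generate both kinds of data. That is the situation the abstract describes, in which a learning algorithm is applied without knowing in advance which kind of process produced the observations.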
Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning
Jump-Diffusion Processes
Financial Decision Making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continuous-Time Reinforcement Learning
Jump-Diffusion Processes
Financial Decision-Making
Xuefeng Gao
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China
Lingfei Li
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China
Xun Yu Zhou
Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA