Reinforcement Learning for Jump-Diffusions

📅 2024-05-26

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This paper studies continuous-time stochastic control when the system dynamics follow jump-diffusion processes, which allow sudden, discontinuous moves in addition to diffusive noise. It formulates an entropy-regularized exploratory control problem with stochastic policies, extending the pure-diffusion framework of Wang et al. (2020); deriving the exploratory dynamics requires a careful formulation of the jump part. The main theoretical finding is that the policy evaluation and q-learning algorithms of Jia and Zhou (2022a, 2023), originally developed for controlled diffusions, can be applied unchanged, without checking in advance whether the data come from a pure diffusion or a jump-diffusion. Jumps do, however, generally affect how actors and critics should be parameterized. Two applications are examined: (1) mean-variance portfolio selection with a jump-diffusion stock price, where both the algorithms and the parameterizations turn out to be invariant with respect to jumps; and (2) option hedging, for which the paper gives a detailed study of how the general theory applies.

📝 Abstract

We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exploration--exploitation balance essential for RL. Unlike the pure diffusion case initially studied by Wang et al. (2020), the derivation of the exploratory dynamics under jump-diffusions calls for a careful formulation of the jump part. Through a theoretical analysis, we find that one can simply use the same policy evaluation and $q$-learning algorithms in Jia and Zhou (2022a, 2023), originally developed for controlled diffusions, without needing to check a priori whether the underlying data come from a pure diffusion or a jump-diffusion. However, we show that the presence of jumps ought to affect parameterizations of actors and critics in general. We investigate as an application the mean--variance portfolio selection problem with stock price modelled as a jump-diffusion, and show that both RL algorithms and parameterizations are invariant with respect to jumps. Finally, we present a detailed study on applying the general theory to option hedging.
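
For readers unfamiliar with jump-diffusions, the following is a minimal simulation sketch, not the paper's model or algorithm. It assumes a Merton-type stock price whose log follows a Brownian motion with drift plus compound Poisson jumps with normally distributed sizes; the function name `simulate_jump_diffusion` and all parameter names and values are hypothetical, chosen only for illustration.

```python
import numpy as np


def simulate_jump_diffusion(s0, mu, sigma, lam, jump_mean, jump_std,
                            horizon, n_steps, rng):
    """Simulate one price path of an assumed Merton-type jump-diffusion.

    The log-price is advanced on a uniform time grid. Each step adds a
    Gaussian diffusion increment and the sum of the jumps that arrive in
    that step. `mu` is the drift of the continuous part only; the jumps
    are not compensated here.
    """
    dt = horizon / n_steps
    log_s = np.empty(n_steps + 1)
    log_s[0] = np.log(s0)
    for k in range(n_steps):
        # Brownian increment over one step.
        dw = rng.normal(0.0, np.sqrt(dt))
        # Number of jumps in the step: Poisson with mean lam * dt.
        n_jumps = rng.poisson(lam * dt)
        # Sum of normally distributed log-jump sizes (0.0 if no jump).
        jump = rng.normal(jump_mean, jump_std, size=n_jumps).sum()
        log_s[k + 1] = (log_s[k]
                        + (mu - 0.5 * sigma**2) * dt
                        + sigma * dw
                        + jump)
    return np.exp(log_s)


# Illustrative parameter values only; they are not taken from the paper.
rng = np.random.default_rng(0)
path = simulate_jump_diffusion(s0=100.0, mu=0.05, sigma=0.2, lam=1.0,
                               jump_mean=-0.05, jump_std=0.1,
                               horizon=1.0, n_steps=252, rng=rng)
```

Setting `lam=0` recovers a pure diffusion, which makes it easy to generate both kinds of data from the same routine and compare them.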

Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning

Jump-Diffusion Processes

Financial Decision-Making

Innovation

Methods, ideas, or system contributions that make the work stand out.

Continuous-Time Reinforcement Learning

Jump-Diffusion Processes

Financial Decision-Making
