Statistical Learning with Sublinear Regret of Propagator Models

📅 2023-01-12
🏛️ Social Science Research Network
📈 Citations: 4
Influential: 0
🤖 AI Summary
This paper studies the sequential liquidation of a risky asset under an unknown price-impact propagation kernel: an agent learns both the temporary and the transient price impact from observed price trajectories, without prior knowledge of the impact parameters, while jointly optimizing revenue and risk. Methodologically, the paper proposes a novel nonparametric framework for estimating the propagation kernel, which solves the associated inverse problem via Tikhonov regularization, and designs an online learning algorithm that alternates between exploration and exploitation phases. Theoretically, it develops a regression-based estimator for the conditional expectation of non-Markovian signals, derives sharp convergence rates for the kernel estimator, proves that the regret during the exploitation phase is \(O(\sqrt{T})\), and shows that the overall algorithm achieves sublinear cumulative regret with high probability.
📝 Abstract
We consider a class of learning problems in which an agent liquidates a risky asset while creating both transient price impact driven by an unknown convolution propagator and linear temporary price impact with an unknown parameter. We characterize the trader's performance as maximization of a revenue-risk functional, where the trader also exploits available information on a price predicting signal. We present a trading algorithm that alternates between exploration and exploitation phases and achieves sublinear regrets with high probability. For the exploration phase we propose a novel approach for non-parametric estimation of the price impact kernel by observing only the visible price process and derive sharp bounds on the convergence rate, which are characterised by the singularity of the propagator. These kernel estimation methods extend existing methods from the area of Tikhonov regularisation for inverse problems and are of independent interest. The bound on the regret in the exploitation phase is obtained by deriving stability results for the optimizer and value function of the associated class of infinite-dimensional stochastic control problems. As a complementary result we propose a regression-based algorithm to estimate the conditional expectation of non-Markovian signals and derive its convergence rate.
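As a rough illustration of the Tikhonov idea mentioned in the abstract, the sketch below estimates a discrete-time impact kernel from exploration episodes by ridge-regularized least squares. It is a finite-dimensional toy under stated assumptions, not the paper's estimator: the paper works with a continuous-time model, observes only the price process, and handles singular propagators. The function names (`conv_matrix`, `simulate`, `tikhonov_kernel_estimate`), the power-law kernel, the Gaussian exploration trades, and the noise level are all hypothetical choices made for this example.

```python
import numpy as np


def conv_matrix(u):
    """Lower-triangular Toeplitz matrix A with A[i, j] = u[i - j] for j <= i,
    so that (A @ g)[i] = sum_j u[i - j] * g[j] (discrete causal convolution)."""
    u = np.asarray(u, dtype=float)
    n = u.size
    A = np.zeros((n, n))
    for i in range(n):
        A[i, : i + 1] = u[i::-1]
    return A


def simulate(g, n_episodes, noise_std, rng):
    """Hypothetical exploration data: random trades and noisy transient impact."""
    n = g.size
    trades = rng.normal(size=(n_episodes, n))
    clean = np.stack([conv_matrix(u) @ g for u in trades])
    impacts = clean + noise_std * rng.normal(size=clean.shape)
    return trades, impacts


def tikhonov_kernel_estimate(trades, impacts, lam):
    """Solve min_g ||A g - y||^2 + lam * ||g||^2 over stacked episodes."""
    A = np.vstack([conv_matrix(u) for u in trades])
    y = np.asarray(impacts, dtype=float).reshape(-1)
    n = A.shape[1]
    # Normal equations of the regularized least-squares problem.
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g_true = (1.0 + np.arange(30)) ** -0.4  # assumed power-law decay
    trades, impacts = simulate(g_true, n_episodes=200, noise_std=0.1, rng=rng)
    g_hat = tikhonov_kernel_estimate(trades, impacts, lam=1.0)
    print("max abs error:", np.max(np.abs(g_hat - g_true)))
```

The regularization weight `lam` trades bias against variance: a larger value stabilizes the inversion when the exploration trades are poorly conditioned, at the cost of shrinking the estimate toward zero.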
Problem

Research questions and friction points this paper is trying to address.

Optimal Selling Strategy
Risk-Adjusted Returns
Market Dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tikhonov regularization
price-impact kernel learning
non-Markovian signal estimation
Eyal Neuman
Imperial College London
Stochastic processes, Mathematical finance
Yufei Zhang
Department of Mathematics, Imperial College London