🤖 AI Summary
This work addresses the challenge of modeling heteroskedasticity and stylized facts (volatility clustering, heavy tails, and the leverage effect) when generating financial time series with diffusion models. It proposes a diffusion-based generative framework that integrates geometric Brownian motion (GBM) into the forward noising process. GBM injects noise in proportion to the asset price, and with the drift and diffusion terms suitably balanced, the log-price dynamics follow a variance-exploding stochastic differential equation (SDE), the form used in score-based generative models. The reverse process is trained by denoising score matching with a Transformer-based architecture adapted from CSDI. Experiments on historical stock data indicate that the model reproduces key financial statistics, including the autocorrelation of squared returns, tail exponents, and asymmetric volatility responses, more realistically than conventional diffusion models.
📝 Abstract
We propose a novel diffusion-based generative framework for financial time series that incorporates geometric Brownian motion (GBM), the foundation of the Black-Scholes theory, into the forward noising process. Unlike standard score-based models that treat price trajectories as generic numerical sequences, our method injects noise proportionally to asset prices at each time step, reflecting the heteroskedasticity observed in financial time series. By appropriately balancing the drift and diffusion terms, we show that the resulting log-price process reduces to a variance-exploding stochastic differential equation, aligning with the formulation in score-based generative models. The reverse-time generative process is trained via denoising score matching using a Transformer-based architecture adapted from the Conditional Score-based Diffusion Imputation (CSDI) framework. Empirical evaluations on historical stock data demonstrate that our model reproduces key stylized facts (heavy-tailed return distributions, volatility clustering, and the leverage effect) more realistically than conventional diffusion models.
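
The reduction to a variance-exploding SDE can be illustrated with Itô's lemma. If the price follows dS = mu S dt + sigma S dW, then d(log S) = (mu - sigma^2 / 2) dt + sigma dW, so choosing mu = sigma^2 / 2 leaves d(log S) = sigma dW, which is pure Gaussian noise in log-price space. The sketch below simulates that case with NumPy. It is only an illustration under stated assumptions: the function name `gbm_forward_noising`, the constant or per-step `sigma` schedule, and the choice mu = sigma^2 / 2 are hypothetical and are not taken from the paper, whose exact parameterization and noise schedule are not given here.

```python
import numpy as np


def gbm_forward_noising(price0, sigma, n_steps, dt=1.0, seed=0):
    """Noise prices with GBM whose drift cancels the Ito correction.

    Hypothetical helper for illustration. Returns an array of shape
    (n_steps + 1,) + price0.shape; row 0 holds the clean prices.
    """
    rng = np.random.default_rng(seed)
    price0 = np.asarray(price0, dtype=float)
    # Accept a scalar sigma or one value per step (assumed schedule).
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), (n_steps,))
    sigma = sigma.reshape((n_steps,) + (1,) * price0.ndim)

    # With mu = sigma^2 / 2 the log-price increment is sigma * sqrt(dt) * z,
    # i.e. d(log S) = sigma dW with no drift term.
    z = rng.standard_normal((n_steps,) + price0.shape)
    increments = sigma * np.sqrt(dt) * z

    # Cumulative sum gives the log-price path; prepend zeros for step 0.
    zeros = np.zeros((1,) + price0.shape)
    log_path = np.log(price0) + np.concatenate(
        [zeros, np.cumsum(increments, axis=0)], axis=0
    )
    # Exponentiating keeps prices positive and makes the noise in price
    # space proportional to the current price level.
    return np.exp(log_path)


if __name__ == "__main__":
    prices = np.array([100.0, 10.0])
    paths = gbm_forward_noising(prices, sigma=0.1, n_steps=50)
    print(paths.shape)
```

In this sketch the variance of the log-price grows as the sum of sigma^2 dt over the steps while its mean stays at the initial log-price, which is the defining behavior of a variance-exploding forward process. In price space, the same log-scale perturbation moves a stock priced at 100 by roughly ten times as much as one priced at 10.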