AI Summary
This work addresses online learning and cumulative regret control for continuous-time data streams generated by a diffusion with unknown coefficients. It proposes a non-anticipative parameter update mechanism based on two-layer neural networks, whose mean-field limit is a stochastic Wasserstein gradient flow adapted to the data filtration. For the first time, mean-field neural networks are introduced into online learning in the continuous-time diffusion setting. Under displacement convexity, the work establishes a constant static regret bound; in the non-convex case, it derives an explicit linear dynamic regret bound that reveals the interplay among data evolution, entropy-driven exploration, and regularization. The theoretical analysis combines mean-field limits, logarithmic Sobolev inequalities, and Malliavin calculus to rigorously bound the regret of both the finite-particle system and its mean-field counterpart. Numerical experiments confirm the method's advantage and highlight the critical roles of network width and regularization parameters.
Abstract
We study continuous-time online learning where data are generated by a diffusion process with unknown coefficients. The learner employs a two-layer neural network, continuously updating its parameters in a non-anticipative manner. The mean-field limit of the learning dynamics corresponds to a stochastic Wasserstein gradient flow adapted to the data filtration. We establish regret bounds for both the mean-field limit and the finite-particle system. Our analysis leverages the logarithmic Sobolev inequality, the Polyak-Łojasiewicz condition, Malliavin calculus, and uniform-in-time propagation of chaos. Under displacement convexity, we obtain a constant static regret bound. In the general non-convex setting, we derive explicit linear regret bounds characterizing the effects of data variation, entropic exploration, and quadratic regularization. Finally, our simulations demonstrate the advantage of the online approach and the impact of network width and regularization parameters.
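To make the learning dynamics described above concrete, the following is a minimal time-discretized sketch, not the paper's algorithm. It assumes an Ornstein-Uhlenbeck data stream, a `tanh` target, a `tanh` activation, and a squared loss; the function name `simulate_online_mean_field` and the parameters `lam` (quadratic regularization) and `tau` (noise temperature, standing in for entropic exploration) are hypothetical. Each update uses only the observation available at the current step, which is the discrete analogue of a non-anticipative update.

```python
import numpy as np


def simulate_online_mean_field(n_particles=200, n_steps=2000, dt=0.01,
                               lam=0.01, tau=0.001, seed=0):
    """Euler-Maruyama sketch of online noisy gradient descent for a
    two-layer mean-field network fed by a diffusion data stream."""
    rng = np.random.default_rng(seed)

    # Particles theta_i = (a_i, w_i, b_i); the network output is their average.
    theta = rng.normal(size=(n_particles, 3))

    x = 0.0                      # state of the data diffusion
    cumulative_loss = 0.0
    losses = np.empty(n_steps)

    for k in range(n_steps):
        # Assumed data stream: Ornstein-Uhlenbeck process. The learner never
        # uses its drift or volatility, only the observed pair (x, y).
        x += -x * dt + 0.5 * np.sqrt(dt) * rng.normal()
        y = np.tanh(2.0 * x)     # assumed target, for illustration only

        a, w, b = theta[:, 0], theta[:, 1], theta[:, 2]
        h = np.tanh(w * x + b)
        pred = np.mean(a * h)    # width-averaged two-layer network
        err = pred - y

        # Loss is recorded with the parameters held before the update.
        losses[k] = 0.5 * err ** 2
        cumulative_loss += losses[k] * dt

        # Per-particle gradient of the squared loss; the 1/N factor of the
        # average is absorbed into the time scale (mean-field scaling).
        dh = 1.0 - h ** 2
        grad = np.stack([err * h, err * a * dh * x, err * a * dh], axis=1)

        # Drift: loss gradient plus quadratic regularization.
        # Diffusion: Gaussian noise with temperature tau.
        noise = rng.normal(size=theta.shape)
        theta = theta - (grad + lam * theta) * dt + np.sqrt(2.0 * tau * dt) * noise

    return theta, losses, cumulative_loss
```

Varying `n_particles`, `lam`, and `tau` in this sketch gives a rough feel for the roles of network width and the regularization parameters, though it does not reproduce the paper's experiments or its regret definitions.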