Generalization Bounds for Heavy-Tailed SDEs through the Fractional Fokker-Planck Equation

📅 2024-02-12

🏛️ International Conference on Machine Learning

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

Existing generalization bounds for heavy-tailed stochastic optimization either hold only in expectation or contain non-computable information-theoretic terms. This paper proves high-probability generalization bounds for heavy-tailed stochastic differential equations (SDEs), used as proxies for stochastic optimizers, that contain no nontrivial information-theoretic terms. The proofs rely on new techniques for estimating the entropy flows associated with the fractional Fokker-Planck equation, the partial differential equation that governs how the distribution of the heavy-tailed SDE evolves. The resulting bounds depend more favorably on the parameter dimension than prior work. The analysis also identifies a phase transition: depending on the problem structure, heavy tails can be either beneficial or harmful for generalization. Experiments in a variety of settings support the theory.
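For orientation, one commonly used form of a heavy-tailed SDE and its fractional Fokker-Planck equation is sketched below. This is an illustrative setup, not necessarily the exact one analyzed in the paper. The symbols are assumptions introduced here: F is the training objective, σ > 0 is a noise scale, L_t^α is a rotationally invariant α-stable Lévy process with characteristic function exp(−t|ξ|^α) for α in (0, 2], and p_t is the density of θ_t.

```latex
\begin{align}
  \mathrm{d}\theta_t &= -\nabla F(\theta_t)\,\mathrm{d}t + \sigma\,\mathrm{d}L_t^{\alpha}, \\
  \partial_t p_t(\theta) &= \nabla \cdot \bigl(p_t(\theta)\,\nabla F(\theta)\bigr)
      - \sigma^{\alpha}\,(-\Delta)^{\alpha/2}\, p_t(\theta).
\end{align}
```

Under these assumptions, setting α = 2 turns the fractional Laplacian into the ordinary Laplacian, and the second line reduces to the classical Fokker-Planck equation with diffusion term σ²Δp_t.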

📝 Abstract

Understanding the generalization properties of heavy-tailed stochastic optimization algorithms has attracted increasing attention over the past years. While illuminating interesting aspects of stochastic optimizers by using heavy-tailed stochastic differential equations as proxies, prior works either provided expected generalization bounds, or introduced non-computable information theoretic terms. Addressing these drawbacks, in this work, we prove high-probability generalization bounds for heavy-tailed SDEs which do not contain any nontrivial information theoretic terms. To achieve this goal, we develop new proof techniques based on estimating the entropy flows associated with the so-called fractional Fokker-Planck equation (a partial differential equation that governs the evolution of the distribution of the corresponding heavy-tailed SDE). In addition to obtaining high-probability bounds, we show that our bounds have a better dependence on the dimension of parameters as compared to prior art. Our results further identify a phase transition phenomenon, which suggests that heavy tails can be either beneficial or harmful depending on the problem structure. We support our theory with experiments conducted in a variety of settings.
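To make the idea of a heavy-tailed SDE used as a proxy for a stochastic optimizer concrete, here is a minimal simulation sketch. It is not the authors' code. It assumes an Euler-Maruyama discretization, symmetric α-stable noise drawn independently per coordinate with the Chambers-Mallows-Stuck method (a simplification of rotationally invariant noise), and a toy quadratic objective. The names `sample_sas` and `simulate_heavy_tailed_sde` and all default parameter values are hypothetical.

```python
import numpy as np


def sample_sas(alpha, size, rng):
    """Standard symmetric alpha-stable samples (Chambers-Mallows-Stuck).

    With alpha = 2 this returns Gaussian samples of variance 2.
    """
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (
        np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
        * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    )


def simulate_heavy_tailed_sde(grad, theta0, alpha=1.8, sigma=0.1,
                              eta=0.01, n_steps=1000, seed=0):
    """Euler-Maruyama path of d(theta) = -grad(theta) dt + sigma dL^alpha.

    Assumption: noise coordinates are independent symmetric alpha-stable
    variables, which is a simplification chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    path = np.empty((n_steps + 1,) + theta.shape)
    path[0] = theta
    for k in range(n_steps):
        noise = sample_sas(alpha, theta.shape, rng)
        # Levy increments over a step of length eta scale as eta**(1/alpha).
        theta = theta - eta * grad(theta) + sigma * eta ** (1.0 / alpha) * noise
        path[k + 1] = theta
    return path


# Toy objective F(theta) = 0.5 * ||theta||^2, whose gradient is theta.
path = simulate_heavy_tailed_sde(lambda th: th, theta0=np.ones(2))
print(path.shape)  # (1001, 2)
```

Smaller values of `alpha` produce heavier tails and occasional large jumps in the path, while `alpha=2.0` gives ordinary Gaussian noise.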

Problem

Research questions and friction points this paper is trying to address.

Prove high-probability generalization bounds for heavy-tailed SDEs

Develop entropy flow techniques via fractional Fokker-Planck equation

Identify a phase transition in how heavy tails affect generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

High-probability bounds for heavy-tailed SDEs

Entropy flows via fractional Fokker-Planck equation

Phase transition in the effect of heavy tails on generalization
