🤖 AI Summary
Traditional stochastic noise injection methods, such as fixed Gaussian noise or uniform dropout, offer limited generalization and are not gradient-aware. To address these limitations, we propose a gradient-fluctuation-aware adaptive regularization method. The approach dynamically estimates the variance and coefficient of variation of neuron-wise gradients and uses them to construct data-dependent random projection matrices. During training, it selectively injects stronger noise into features whose gradients fluctuate strongly (i.e., unstable features), enabling fine-grained, adaptive implicit regularization. To the best of our knowledge, this is the first work to explicitly use gradient fluctuation to guide the design of stochastic projection noise. Extensive experiments on MNIST, CIFAR-10, and SVHN demonstrate that the method consistently outperforms standard dropout, Gaussian noise injection, and fixed-strength random projection baselines, with significant gains in test accuracy, model robustness, and internal stability.
📝 Abstract
We propose VISP: Volatility Informed Stochastic Projection, an adaptive regularization method that leverages gradient volatility to guide stochastic noise injection in deep neural networks. Unlike conventional techniques that apply uniform noise or fixed dropout rates, VISP dynamically computes volatility from gradient statistics and uses it to scale a stochastic projection matrix. This mechanism selectively regularizes inputs and hidden nodes that exhibit higher gradient volatility while preserving stable representations, thereby mitigating overfitting. Extensive experiments on MNIST, CIFAR-10, and SVHN demonstrate that VISP consistently improves generalization performance over baseline models and fixed-noise alternatives. In addition, detailed analyses of the evolution of volatility, the spectral properties of the projection matrix, and activation distributions reveal that VISP not only stabilizes the internal dynamics of the network but also fosters a more robust feature representation.
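The abstract does not give the exact formulas, so the following is only a minimal NumPy sketch of the general idea under stated assumptions: volatility is taken to be the per-feature coefficient of variation of gradients over a short window of recent steps, and the projection is assumed to be the identity plus Gaussian noise whose columns are scaled by normalized volatility. The function names (`gradient_volatility`, `volatility_scaled_projection`) and the `base_scale` and `eps` parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def gradient_volatility(grad_history, eps=1e-8):
    """Per-feature coefficient of variation of gradients.

    grad_history: array of shape (steps, features), one gradient
    vector per recent training step (an assumed bookkeeping scheme).
    """
    g = np.asarray(grad_history, dtype=float)
    mean = g.mean(axis=0)
    std = g.std(axis=0)
    # Coefficient of variation: std relative to the mean magnitude.
    return std / (np.abs(mean) + eps)


def volatility_scaled_projection(x, vol, base_scale=0.1, rng=None, eps=1e-8):
    """Apply x @ (I + N), where column j of N is Gaussian noise with
    standard deviation base_scale * (vol[j] / max(vol)).

    Output feature j therefore receives noise in proportion to its
    volatility; a feature with zero volatility passes through unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    vol = np.asarray(vol, dtype=float)
    v = vol / (vol.max() + eps)          # normalize to roughly [0, 1]
    d = x.shape[-1]
    # Broadcasting a (d,) vector against (d, d) scales each column.
    noise = rng.standard_normal((d, d)) * (base_scale * v)
    projection = np.eye(d) + noise
    return x @ projection


# Usage: feature 0 has perfectly stable gradients, feature 1 fluctuates.
history = np.array([[1.0, 1.0], [1.0, -1.0], [1.0, 3.0]])
vol = gradient_volatility(history)
batch = np.ones((4, 2))
noisy = volatility_scaled_projection(batch, vol, rng=np.random.default_rng(0))
```

In this sketch the stable feature is left untouched while the volatile one is perturbed, which mirrors the selective regularization described above. At evaluation time one would presumably skip the projection entirely, as with dropout; that choice is also an assumption here.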