Deep Learning without Global Optimization by Random Fourier Neural Networks

📅 2024-07-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0

🤖 AI Summary
Training deep neural networks typically relies on global, gradient-based optimization, which can struggle with high-frequency or nonsmooth targets and yields parameters that are hard to interpret. This paper proposes a training algorithm for Random Fourier Neural Networks (RFNNs), deep networks that use random complex exponential activation functions. Layers are trained iteratively with a Markov Chain Monte Carlo (MCMC) sampling procedure, which avoids global and gradient-based optimization while maintaining error control. The method consistently attains the theoretical approximation rate for residual networks with complex exponential activation functions, a rate determined by network complexity. It also learns multiscale and high-frequency features efficiently and produces interpretable parameter distributions. Although the basis functions are sinusoidal, the authors report observing no Gibbs phenomena when approximating discontinuous target functions.

📝 Abstract
We introduce a new training algorithm for deep neural networks that utilize random complex exponential activation functions. Our approach employs a Markov Chain Monte Carlo sampling procedure to iteratively train network layers, avoiding global and gradient-based optimization while maintaining error control. It consistently attains the theoretical approximation rate for residual networks with complex exponential activation functions, determined by network complexity. Additionally, it enables efficient learning of multiscale and high-frequency features, producing interpretable parameter distributions. Despite using sinusoidal basis functions, we do not observe Gibbs phenomena in approximating discontinuous target functions.
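
For orientation, one common way to write a single layer built from complex exponential activation functions is the random Fourier feature expansion below. The symbols K, \omega_k, and \beta_k are notation assumed here for illustration; the abstract does not fix a notation, and the paper's own parameterization may differ.

```latex
% Illustrative notation (an assumption, not quoted from the paper):
%   K        number of nodes in the layer
%   \omega_k randomly sampled frequency of node k
%   \beta_k  complex amplitude of node k
f(x) \;=\; \sum_{k=1}^{K} \beta_k \, e^{\, i \, \omega_k \cdot x}
```

In this form the frequencies \omega_k are the quantities that a sampling procedure can draw, while the amplitudes \beta_k can be obtained from a linear least-squares fit once the frequencies are fixed.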
Problem

Research questions and friction points this paper is trying to address.

Avoiding global and gradient-based optimization when training deep networks.
Attaining the theoretical approximation rate for networks with complex exponential activation functions.
Learning multiscale and high-frequency features efficiently without Gibbs phenomena.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random complex exponential activation functions
Markov Chain Monte Carlo sampling training
Efficient multiscale high-frequency feature learning
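
To make the combination of random complex exponential features and sampling-based training more concrete, here is a minimal single-layer sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: the random-walk proposal, the acceptance rule based on amplitude magnitudes, and the names and parameters `sample_frequencies`, `delta`, `gamma`, and `ridge` are all hypothetical choices made here. The abstract describes iterative training of several layers; this sketch covers one layer only.

```python
import numpy as np

def features(x, omega):
    """Complex exponential features exp(i * omega_k * x), shape (N, K)."""
    return np.exp(1j * np.outer(x, omega))

def fit_amplitudes(x, y, omega, ridge=1e-6):
    """Ridge-regularized least-squares amplitudes for fixed frequencies."""
    S = features(x, omega)
    A = S.conj().T @ S + ridge * len(x) * np.eye(len(omega))
    return np.linalg.solve(A, S.conj().T @ y)

def predict(x, omega, beta):
    """Real part of sum_k beta_k * exp(i * omega_k * x)."""
    return (features(x, omega) @ beta).real

def sample_frequencies(x, y, K=32, steps=100, delta=0.5, gamma=3.0, seed=0):
    """Metropolis-style random walk over frequencies (assumed scheme).

    Each frequency gets a Gaussian proposal; it is accepted with probability
    min(1, (|beta_new| / |beta_old|) ** gamma), so frequencies whose refit
    amplitudes grow are favored. No gradients of the loss are computed.
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal(K)                 # initial random frequencies
    beta = fit_amplitudes(x, y, omega)
    for _ in range(steps):
        proposal = omega + delta * rng.standard_normal(K)
        beta_prop = fit_amplitudes(x, y, proposal)
        ratio = (np.abs(beta_prop) / np.maximum(np.abs(beta), 1e-12)) ** gamma
        accept = rng.uniform(size=K) < ratio       # per-frequency accept/reject
        omega = np.where(accept, proposal, omega)
        beta = fit_amplitudes(x, y, omega)         # refit on accepted set
    return omega, beta

if __name__ == "__main__":
    # Discontinuous target: a step function on [-pi, pi].
    x = np.linspace(-np.pi, np.pi, 200)
    y = np.sign(x)
    omega, beta = sample_frequencies(x, y)
    print("training MSE:", np.mean((predict(x, omega, beta) - y) ** 2))
```

The only optimization step is a linear least-squares solve for the amplitudes; the frequencies are moved by accept/reject sampling alone. A multi-layer variant could, as one possible design, fit each new layer to the residual left by the previous layers, but that extension is not shown here.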