Convergence Analysis of Two-Layer Neural Networks under Gaussian Input Masking

πŸ“… 2026-02-19
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This study addresses the convergence of training two-layer ReLU neural networks under Gaussian random masking of inputs, a setting relevant to sensor noise, missing data, and privacy-preserving mechanisms. By leveraging neural tangent kernel (NTK) theory and providing a refined characterization of the intrinsic randomness induced by the ReLU activation under masked inputs, the work establishes the first linear convergence guarantee in this context: gradient descent converges linearly to a neighborhood of the global optimum, with the radius of this neighborhood proportional to the variance of the masking noise. This result overcomes a key technical challenge in jointly handling nonlinear activations and input randomness, offering a rigorous theoretical foundation for training neural networks with noisy or partially observed inputs.
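
As a rough illustration of the training setup described above (not the authors' code), the sketch below trains a two-layer ReLU network in a standard NTK-style parameterization: fixed random output weights, gradient descent on the hidden layer only, and an elementwise Gaussian input mask, assumed here to be N(1, σ²), resampled at every step. The dimensions, mask distribution, and hyperparameters are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the paper's exact setup): a two-layer ReLU network
# f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x), with outer weights a_r fixed
# at +/-1 and only the hidden-layer weights W trained, as is standard in
# NTK-style analyses. Inputs are perturbed by an elementwise Gaussian mask
# (assumed N(1, sigma^2)) that is resampled at every gradient step.

rng = np.random.default_rng(0)

n, d, m = 64, 10, 512          # samples, input dimension, hidden width
sigma = 0.1                    # std of the Gaussian input mask (assumed)
eta, steps = 0.5, 2000         # step size and number of GD steps

X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
y = rng.normal(size=n)

W = rng.normal(size=(m, d))                     # hidden weights (trained)
a = rng.choice([-1.0, 1.0], size=m)             # output weights (fixed)

def forward(W, X_in):
    """Network output (1/sqrt(m)) * a^T relu(W x) for each row of X_in."""
    H = np.maximum(X_in @ W.T, 0.0)              # (n, m) hidden activations
    return H @ a / np.sqrt(m)

for t in range(steps):
    # Fresh Gaussian input mask each step: x_tilde = x * mask, mask ~ N(1, sigma^2).
    mask = 1.0 + sigma * rng.normal(size=X.shape)
    X_masked = X * mask

    err = forward(W, X_masked) - y               # residual on the masked batch

    # Gradient of (1/(2n)) * ||err||^2 w.r.t. W; ReLU is piecewise linear,
    # so each hidden unit contributes only where it is active.
    active = (X_masked @ W.T > 0).astype(float)  # (n, m) ReLU gates
    grad = ((err[:, None] * active) * a[None, :]).T @ X_masked / np.sqrt(m)
    W -= eta / n * grad

    if t % 500 == 0:
        clean_loss = 0.5 * np.mean((forward(W, X) - y) ** 2)
        print(f"step {t:4d}  clean-input loss {clean_loss:.4f}")
```

Resampling the mask at every step is what produces the residual error region: even once the clean-input loss stops improving, each gradient is computed on a perturbed batch, so the loss settles at a floor that grows with the mask variance.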

πŸ“ Abstract
We investigate convergence guarantees for training two-layer neural networks with Gaussian randomly masked inputs. This scenario corresponds to Gaussian dropout at the input level, or to noisy-input training common in sensor networks, privacy-preserving training, and federated learning, where each user may have access to only partial or corrupted features. Using a Neural Tangent Kernel (NTK) analysis, we demonstrate that training a two-layer ReLU network with Gaussian randomly masked inputs achieves linear convergence up to an error region proportional to the mask's variance. A key technical contribution is resolving the randomness within the non-linear activation, a problem of independent interest.
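
To make "linear convergence up to an error region proportional to the mask's variance" concrete, the display below sketches the generic shape such an NTK-style guarantee typically takes; the symbols and constants are illustrative assumptions, not the paper's actual theorem statement.

```latex
% Illustrative shape only (not the paper's exact statement): the training loss
% L(W_t) under gradient descent with step size \eta contracts geometrically,
% down to a noise floor proportional to the mask variance \sigma^2.
% \lambda_0 denotes the smallest eigenvalue of the limiting NTK Gram matrix,
% and C hides problem-dependent factors.
\[
  L(W_t) \;\le\; \Bigl(1 - \tfrac{\eta\,\lambda_0}{2}\Bigr)^{t}\, L(W_0)
  \;+\; C\,\sigma^{2}.
\]
```
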
Problem

Research questions and friction points this paper is trying to address.

convergence
two-layer neural networks
Gaussian input masking
noisy input training
input dropout
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian input masking
Neural Tangent Kernel
convergence analysis
two-layer neural networks
ReLU activation
πŸ”Ž Similar Papers
No similar papers found.