Convergence Analysis of Two-Layer Neural Networks under Gaussian Input Masking

📅 2026-02-19

📈 Citations: 0

✨ Influential: 0

career value

226K/year

🤖 AI Summary

This study addresses the convergence of training two-layer ReLU neural networks under Gaussian random masking of inputs—a setting relevant to sensor noise, missing data, and privacy-preserving mechanisms. By leveraging neural tangent kernel (NTK) theory and providing a refined characterization of the intrinsic randomness induced by the ReLU activation under masked inputs, the work establishes the first linear convergence guarantee in this context: gradient descent converges linearly to a neighborhood of the global optimum, with the radius of this neighborhood proportional to the variance of the masking noise. This result overcomes a key technical challenge in jointly handling nonlinear activations and input randomness, offering a rigorous theoretical foundation for training neural networks with noisy or partially observed inputs.

Technology Category

Application Category

📝 Abstract

We investigate the convergence guarantee of two-layer neural network training with Gaussian randomly masked inputs. This scenario corresponds to Gaussian dropout at the input level, or noisy input training common in sensor networks, privacy-preserving training, and federated learning, where each user may have access to partial or corrupted features. Using a Neural Tangent Kernel (NTK) analysis, we demonstrate that training a two-layer ReLU network with Gaussian randomly masked inputs achieves linear convergence up to an error region proportional to the mask's variance. A key technical contribution is resolving the randomness within the non-linear activation, a problem of independent interest.

Problem

Research questions and friction points this paper is trying to address.

convergence

two-layer neural networks

Gaussian input masking

noisy input training

input dropout

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian input masking

Neural Tangent Kernel

convergence analysis