Learning High-Dimensional Parity Functions with Product Networks using Gradient Descent

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge that learning high-dimensional parity functions typically requires exponential sample complexity under general settings, rendering them intractable for gradient-based methods. The authors propose a novel approach combining product-type neural networks, sparse Bernoulli inputs with $p_e \leq 1/N$, and carefully tuned hyperparameters, achieving polynomial sample complexity via gradient descent for the first time in dimensions as high as $N = 10^5$. Theoretical analysis establishes a crucial connection between the network’s inductive bias and input sparsity, providing convergence guarantees. Empirical validation confirms the method’s efficacy, identifies optimal choices for $p_e$ and learning rate $\alpha$, and reveals clear polynomial scaling behavior.
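A short worked calculation may help show what the sparsity condition $p_e \leq 1/N$ implies. It assumes the input bits are independent, which the summary does not state explicitly; the symbol $K$ for the number of active bits is introduced here only for illustration.

$$
x_i \sim \mathrm{Bernoulli}(p_e)\ \text{i.i.d.} \;\Rightarrow\; K = \sum_{i=1}^{N} x_i \sim \mathrm{Binomial}(N, p_e), \qquad \mathbb{E}[K] = N p_e \leq 1 .
$$

At the boundary $p_e = 1/N$ and for large $N$:

$$
\Pr[K = 0] = \left(1 - \tfrac{1}{N}\right)^{N} \to e^{-1}, \qquad \Pr[K = 1] = \left(1 - \tfrac{1}{N}\right)^{N-1} \to e^{-1}, \qquad \Pr[K \leq 1] \to \tfrac{2}{e} \approx 0.736 .
$$

Under this assumption, roughly three quarters of the samples contain at most one active bit, regardless of how large $N$ is.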

📝 Abstract

Parity functions are fundamental Boolean operations with critical applications across machine learning, cryptography, and error correction. Yet learning high-dimensional parity functions poses significant challenges: in a general setting, standard neural network architectures typically require exponential sample complexity, making gradient-based optimization intractable for a large number of inputs $N$. We demonstrate that compact product-based neural architectures combined with stochastic data sparsity (Bernoulli inputs with $p_e \leq 1/N$) and appropriate hyperparameter choices enable efficient parity learning, with theoretical guarantees of convergence. Experiments validate our theory across dimensions up to $N = 100{,}000$, with empirical evidence showing optimal hyperparameter choices for $p_e$ and learning rate $\alpha$, as well as polynomial complexity scaling laws. This work establishes fundamental connections between architectural inductive bias and data sparsity, opening new possibilities for neural arithmetic, structured reasoning, binary neural networks, and machine learning applied to automated protocol discovery.
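The following is a minimal sketch of why a product-type unit is a natural fit for parity, not the paper's architecture, which the abstract does not specify. It relies on the standard identity that, in the ±1 encoding, the parity of a set of bits equals the product of the factors $(1 - 2x_i)$ over that set. The parametrization `product_unit`, the helper names, the random target subset `mask`, and the sizes used are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_sparse_inputs(num_samples, n, p_e, rng):
    """Draw binary inputs whose bits are i.i.d. Bernoulli(p_e) (independence is assumed)."""
    return (rng.random((num_samples, n)) < p_e).astype(np.float64)


def parity_sign(x, mask):
    """Parity of the bits selected by mask, in +/-1 encoding (+1 means even)."""
    counts = (x * mask).sum(axis=1)
    return 1.0 - 2.0 * (counts % 2)


def product_unit(x, w):
    """Hypothetical product unit: f(x) = prod_i (1 - w_i * x_i)."""
    return np.prod(1.0 - w * x, axis=1)


rng = np.random.default_rng(0)
n = 1000                      # input dimension N (illustrative size)
p_e = 1.0 / n                 # sparsity at the boundary p_e = 1/N
mask = (rng.random(n) < 0.5).astype(np.float64)  # hypothetical target subset

x = sample_sparse_inputs(512, n, p_e, rng)
y = parity_sign(x, mask)

# Setting w_i = 2 on the target subset and 0 elsewhere reproduces parity exactly.
w_exact = 2.0 * mask
y_hat = product_unit(x, w_exact)
mean_active_bits = x.sum(axis=1).mean()
```

In this parametrization a sample with a single active bit $i$ gives $f(x) = 1 - w_i$, so its error depends on one weight only. That is one plausible reading of the connection between sparsity and the product inductive bias that the abstract describes, offered here as an interpretation rather than as the authors' argument.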

Problem

Research questions and friction points this paper is trying to address.

parity functions

high-dimensional learning

sample complexity

gradient descent

neural networks

Innovation

Methods, ideas, or system contributions that make the work stand out.

product networks

parity functions

gradient descent