🤖 AI Summary
This work investigates the conditions under which the outputs of wide neural networks become approximately independent under random initialization, in order to test the Alignment Research Center's "computational no-coincidence conjecture" — a theoretical characterization of the limits of AI interpretability.
Method: We conduct rigorous theoretical analysis in the infinite-width limit, leveraging Gaussian process modeling, asymptotic analysis for large width, and exact computation of function expectations.
Contribution/Results: We prove that network outputs converge to statistical independence if and only if the activation function has zero mean under the standard Gaussian distribution. We then identify common activations satisfying this condition — including shifted ReLU, shifted GeLU, and tanh — and provide formal verification. Our results establish the zero-mean property as a necessary and sufficient criterion for output decoupling in wide networks, yielding an empirically verifiable candidate model for the conjecture. This advances quantitative assessment of interpretability boundaries in deep learning.
📝 Abstract
We establish that randomly initialized neural networks, with large width and a natural choice of hyperparameters, have nearly independent outputs exactly when their activation function is nonlinear with zero mean under the Gaussian measure: $\mathbb{E}_{z \sim \mathcal{N}(0,1)}[\sigma(z)] = 0$. For example, this includes tanh and shifted versions of ReLU and GeLU, but not ReLU or GeLU by themselves. Because of their nearly independent outputs, we propose neural networks with zero-mean activation functions as a promising candidate for the Alignment Research Center's computational no-coincidence conjecture -- a conjecture that aims to measure the limits of AI interpretability.