Near Optimal Algorithms for Noisy $k$-XOR under Low-Degree Heuristic

📅 2026-04-12

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work studies the problem of recovering a hidden Boolean assignment from random noisy $k$-ary XOR constraints in the high-noise regime, characterizing the trade-off among sample complexity, noise level, and running time. The authors give an algorithm that combines a refined second-moment analysis with color coding and dynamic programming for structured hypergraph embedding statistics. For every parameter $D$, it runs in time $n^{D+O(1)}$ and recovers the assignment given $m \geq C_k n^{k/2}/(D^{k/2-1}\delta^2)$ samples, where $C_k$ depends only on $k$ and $\delta$ is the noise bias. This matches the best previously known time-sample trade-off for detection while also providing a recovery guarantee, and the dependence on $\delta$ is optimal up to constant factors, matching the information-theoretic scaling. A matching bound on the degree-$D$ low-degree likelihood ratio indicates that, under the low-degree heuristic, the algorithm is near-optimal over a broad range of parameters.
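To make the time-sample trade-off concrete, the following minimal sketch evaluates the right-hand side of the sample bound for given parameters. The function name `sample_threshold` and the default `C_k = 1.0` are illustrative assumptions; the paper's explicit constant is not stated on this page.

```python
def sample_threshold(n, k, D, delta, C_k=1.0):
    """Evaluate C_k * n^(k/2) / (D^(k/2 - 1) * delta^2).

    C_k = 1.0 is a placeholder, not the paper's constant, so the
    returned value only illustrates how the bound scales.
    """
    if not 0 < delta <= 1:
        raise ValueError("delta must lie in (0, 1]")
    if D < 1 or k < 2:
        raise ValueError("expected D >= 1 and k >= 2")
    return C_k * n ** (k / 2) / (D ** (k / 2 - 1) * delta ** 2)


if __name__ == "__main__":
    # Larger D means more running time (n^(D + O(1))) but fewer samples.
    for D in (1, 4, 16):
        print(D, sample_threshold(n=100, k=4, D=D, delta=0.5))
```

For $k = 2$ the factor $D^{k/2-1}$ equals 1, so in this formula a larger $D$ only reduces the sample requirement when $k \geq 3$.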


📝 Abstract

Noisy $k$-XOR is a basic average-case inference problem in which one observes random noisy $k$-ary parity constraints and seeks to recover, or more weakly, detect, a hidden Boolean assignment. A central question is to characterize the tradeoff among sample complexity, noise level, and running time. We give a recovery algorithm, and hence also a detection algorithm, for noisy $k$-XOR in the high-noise regime. For every parameter $D$, our algorithm runs in time $n^{D+O(1)}$ and succeeds whenever $$ m \ge C_k \frac{n^{k/2}}{D^{\,k/2-1}\delta^2}, $$ where $C_k$ is an explicit constant depending only on $k$, and $\delta$ is the noise bias. Our result matches the best previously known time-sample tradeoff for detection, while simultaneously yielding recovery guarantees. In addition, the dependence on the noise bias $\delta$ is optimal up to constant factors, matching the information-theoretic scaling. We also prove matching low-degree lower bounds. In particular, we show that the degree-$D$ low-degree likelihood ratio has bounded $L^2$-norm below the same threshold, up to the same factor $D^{k/2-1}$. Under the low-degree heuristic, this implies that our algorithm is near-optimal over a broad range of parameters. Our approach combines a refined second-moment analysis with color coding and dynamic programming for structured hypergraph embedding statistics. These techniques may be of independent interest for other average-case inference problems.
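As an illustration of the observation model, here is a minimal sketch of sampling a noisy $k$-XOR instance. It is not the paper's algorithm, and several conventions are assumptions made for the example: the assignment is encoded in $\{-1,+1\}^n$, each constraint uses $k$ distinct uniformly random variables, and each label equals the true parity with probability $(1+\delta)/2$, so that the expected agreement between label and parity is $\delta$. The names `sample_noisy_kxor` and `empirical_agreement` are hypothetical.

```python
import random


def sample_noisy_kxor(n, k, m, delta, seed=0):
    """Draw a hidden assignment x and m noisy k-ary parity constraints.

    Assumed convention: a label keeps the true parity with probability
    (1 + delta) / 2 and is flipped otherwise.
    """
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]
    constraints = []
    for _ in range(m):
        idx = tuple(rng.sample(range(n), k))  # k distinct variable indices
        parity = 1
        for i in idx:
            parity *= x[i]
        sign = 1 if rng.random() < (1 + delta) / 2 else -1
        constraints.append((idx, parity * sign))
    return x, constraints


def empirical_agreement(x, constraints):
    """Average of label * true parity; concentrates near delta."""
    total = 0
    for idx, label in constraints:
        parity = 1
        for i in idx:
            parity *= x[i]
        total += label * parity
    return total / len(constraints)


if __name__ == "__main__":
    x, cons = sample_noisy_kxor(n=50, k=3, m=20000, delta=0.2)
    print(empirical_agreement(x, cons))
```

The agreement statistic uses the hidden assignment, so it only checks that the generator has the intended bias. Recovering `x` from the constraints alone is the hard problem the abstract addresses.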

Problem

Research questions and friction points this paper is trying to address.

noisy k-XOR

sample complexity

noise level

running time

low-degree heuristic

Innovation

Methods, ideas, or system contributions that make the work stand out.

noisy k-XOR

low-degree heuristic

sample complexity