🤖 AI Summary
This work studies the optimal distinguishing advantage achievable by polynomial-time algorithms on the Planted Clique problem, i.e., the maximum advantage in distinguishing an Erdős–Rényi graph $G(n,1/2)$ from a graph $G(n,1/2,k)$ with a clique planted on a random subset of $k$ vertices. Assuming the Planted Clique Hypothesis, we show that the advantage of any efficient algorithm is bounded by the low-degree advantage, which is roughly $k^2/(\sqrt{2}n)$. A simple edge-counting algorithm achieves advantage $k^2/(\sqrt{\pi}n)$, so the optimal advantage is $\Theta(k^2/n)$ and the bound is tight up to a small constant factor. We also construct an efficiently sampleable distribution $\mathcal{P}^*$, supported on graphs containing $k$-cliques, that is much harder than the standard planted clique distribution: no efficient algorithm can distinguish $\mathcal{P}^*$ from $G(n,1/2)$ with advantage $n^{-d}$, for an arbitrarily large constant $d$. Together, these results pin down the optimal advantage of efficient algorithms on Planted Clique up to a constant factor.
📝 Abstract
In a distinguishing problem, the input is a sample drawn from one of two distributions and the algorithm is tasked with identifying the source distribution. The performance of a distinguishing algorithm is measured by its advantage, i.e., its incremental probability of success over a random guess. A classic example of a distinguishing problem is the Planted Clique problem, where the input is a graph sampled from either $G(n,1/2)$, the standard Erdős-Rényi model, or $G(n,1/2,k)$, the Erdős-Rényi model with a clique planted on a random subset of $k$ vertices. The Planted Clique Hypothesis asserts that efficient algorithms cannot achieve advantage better than some absolute constant, say $1/4$, whenever $k=n^{1/2-\Omega(1)}$. In this work, we aim to precisely understand the optimal distinguishing advantage achievable by efficient algorithms on Planted Clique. We show the following results under the Planted Clique Hypothesis:
1. Optimality of low-degree polynomials: The optimal advantage achievable by any efficient algorithm is bounded by the low-degree advantage, which is the advantage achievable by low-degree polynomials of the input. The low-degree advantage is roughly $k^2/(\sqrt{2}n)$. Conversely, a simple edge-counting algorithm achieves advantage $k^2/(\sqrt{\pi}n)$, showing that our bound is tight up to a small constant factor.
2. Harder planted distributions: There is an efficiently sampleable distribution $\mathcal{P}^*$ supported on graphs containing $k$-cliques such that no efficient algorithm can distinguish $\mathcal{P}^*$ from $G(n,1/2)$ with advantage $n^{-d}$ for an arbitrarily large constant $d$. In other words, there exist alternate planted distributions that are much harder than $G(n,1/2,k)$.
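To make the edge-counting distinguisher concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the function names, the midpoint threshold, and the parameter values are all hypothetical choices. The empirical advantage is measured here as the acceptance rate on planted graphs minus the acceptance rate on $G(n,1/2)$, a normalization that may differ from the paper's by a constant factor. The example also uses a fairly large $k$ relative to $\sqrt{n}$ so that the signal is visible with few trials; in the regime $k=n^{1/2-\Omega(1)}$ the advantage would be much smaller.

```python
import random
from itertools import combinations
from math import comb


def sample_graph(n, k, rng):
    """Edge set of G(n, 1/2) with a clique planted on k random vertices.

    With k = 0 this is a plain G(n, 1/2) sample.
    """
    clique = set(rng.sample(range(n), k))
    edges = set()
    for u, v in combinations(range(n), 2):
        # Pairs inside the clique are always edges; all others are fair coins.
        if (u in clique and v in clique) or rng.random() < 0.5:
            edges.add((u, v))
    return edges


def edge_count_test(edges, n, k):
    """Guess 'planted' (True) when the edge count exceeds a threshold.

    Assumed threshold: the midpoint of the expected edge counts,
    comb(n, 2) / 2 without a clique and comb(n, 2) / 2 + comb(k, 2) / 2 with one.
    """
    threshold = comb(n, 2) / 2 + comb(k, 2) / 4
    return len(edges) > threshold


def empirical_advantage(n, k, trials, seed=0):
    """Estimate P[guess planted | planted] - P[guess planted | G(n, 1/2)]."""
    rng = random.Random(seed)
    hits_planted = sum(
        edge_count_test(sample_graph(n, k, rng), n, k) for _ in range(trials)
    )
    hits_null = sum(
        edge_count_test(sample_graph(n, 0, rng), n, k) for _ in range(trials)
    )
    return (hits_planted - hits_null) / trials
```

Calling `empirical_advantage(60, 12, 300)` gives a Monte Carlo estimate for small graphs; shrinking `k` at fixed `n` should show the estimate falling off, in line with the $k^2/n$ scaling described above.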