🤖 AI Summary
This paper introduces the “quality control problem,” a framework for distinguishing high-quality inputs (ρ ≈ 1) from low-quality, possibly adversarial ones (ρ far from 1) in sublinear time, i.e., with o(N) queries and runtime. Taking the Erdős–Rényi random graph model G_{n,p} as a case study, it uses the normalized k-clique count ρ_k (the number of k-cliques divided by its expectation under G_{n,p}) as the quality measure and gives an algorithm that needs only p^{-O(k)} edge queries and time, provided n ≥ p^{-ck} for some constant c. The approach generalizes to arbitrary motifs H, achieving query complexity p^{-O(Δ(H))}, where Δ(H) denotes H’s maximum degree. Since directly testing whether ρ_k ≈ 1 requires p^{-Ω(k^2)} samples, quality control is superpolynomially more efficient in this setting. The core conceptual innovation lies in formulating input quality verification as a distribution-aware sublinear decision problem. The bounds are established via probabilistic analysis and combinatorial estimation.
📝 Abstract
Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call "Quality Control Problems." These problems are specified by a (positive, real-valued) "quality function" $ρ$ and a distribution $D$ such that, with high probability, a sample drawn from $D$ is "high quality," meaning its $ρ$-value is near $1$. The goal is to accept inputs $x \sim D$ and reject potentially adversarially generated inputs $x$ with $ρ(x)$ far from $1$. The objective of quality control is thus weaker than either component problem: testing for "$ρ(x) \approx 1$" or testing if $x \sim D$, and offers the possibility of more efficient algorithms.
In this work, we consider the sublinear version of the quality control problem, where $D \in Δ(\{0,1\}^N)$ and the goal is to solve the $(D, ρ)$-quality problem with $o(N)$ queries and time. As a case study, we consider random graphs, i.e., $D = G_{n,p}$ (and $N = \binom{n}{2}$), and the $k$-clique count function $ρ_k := C_k(G)/\mathbb{E}_{G' \sim G_{n,p}}[C_k(G')]$, where $C_k(G)$ is the number of $k$-cliques in $G$. Testing if $G \sim G_{n,p}$ with one sample, let alone with sublinear query access to the sample, is of course impossible. Testing if $ρ_k(G) \approx 1$ requires $p^{-Ω(k^2)}$ samples. In contrast, we show that the quality control problem for $G_{n,p}$ (with $n \geq p^{-ck}$ for some constant $c$) with respect to $ρ_k$ can be tested with $p^{-O(k)}$ queries and time, showing quality control is provably superpolynomially more efficient in this setting. More generally, for a motif $H$ of maximum degree $Δ(H)$, the respective quality control problem can be solved with $p^{-O(Δ(H))}$ queries and running time.
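To make the quality function concrete, here is a minimal Python sketch that evaluates $ρ_k$ on a small graph by brute force. It is an illustration of the definition only, not the paper's algorithm: it enumerates every $k$-subset of vertices and so reads all $N$ edge slots, which is the opposite of sublinear. The function names (`sample_gnp`, `count_k_cliques`, `expected_k_cliques`, `rho_k`) are hypothetical, and the expectation uses the standard identity $\mathbb{E}[C_k] = \binom{n}{k} p^{\binom{k}{2}}$ for $G_{n,p}$.

```python
import itertools
import math
import random


def sample_gnp(n, p, rng):
    """Sample an Erdos-Renyi graph G_{n,p} as a set of frozenset edges."""
    return {
        frozenset((u, v))
        for u, v in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


def count_k_cliques(n, edges, k):
    """Brute-force C_k(G): count k-subsets whose vertex pairs are all edges."""
    return sum(
        all(frozenset(pair) in edges for pair in itertools.combinations(subset, 2))
        for subset in itertools.combinations(range(n), k)
    )


def expected_k_cliques(n, p, k):
    """E[C_k(G')] for G' ~ G_{n,p}: binom(n, k) * p^binom(k, 2)."""
    return math.comb(n, k) * p ** math.comb(k, 2)


def rho_k(n, edges, p, k):
    """Quality function: observed k-clique count over its expectation."""
    return count_k_cliques(n, edges, k) / expected_k_cliques(n, p, k)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    n, p, k = 40, 0.5, 3
    typical = sample_gnp(n, p, rng)
    complete = {frozenset(e) for e in itertools.combinations(range(n), 2)}
    print("rho_k on a G_{n,p} sample:", rho_k(n, typical, p, k))
    print("rho_k on the complete graph:", rho_k(n, complete, p, k))
```

On the complete graph every $k$-subset is a clique, so $ρ_k = p^{-\binom{k}{2}}$ (which is 8 for $p = 0.5$, $k = 3$), an input whose quality is far from $1$ and should be rejected. The empty graph gives $ρ_k = 0$ for $k \geq 2$.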