Certification from Examples is Hard for Circuits and Transformers under Minimal Overparametrization

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates the computational hardness of exactly certifying, from a limited set of labeled examples, that a neural network (such as a threshold circuit of depth at least 2 or a Transformer) is functionally equivalent to a target function when the hypothesis class is only minimally overparametrized. Combining complexity-theoretic analysis, explicit circuit constructions, and experiments on recognizing binary addition, it shows that adding a single gate to a threshold circuit, or a constant-size architectural extension to a log-precision Transformer, can force the certificate required for exact certification to grow exponentially in the input dimension. For approximate certification, it proves that tolerating only polynomially many mistakes still requires exponentially large certificates, whereas constant relative-error guarantees can hide exponentially many mistakes. Experimentally, the constructed circuits instantiate the exponential barrier, and imperfect trained Transformers are shown to evade detection by large uniformly sampled candidate certificates, indicating that even marginal increases in model capacity can undermine certifiability.
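As a rough illustration of why uniform sampling can fail to expose a few errors, the sketch below computes the probability that independently, uniformly drawn inputs all miss the inputs on which a hypothesis is wrong. The function name `miss_probability` and its parameters are hypothetical, and the independent-sampling model is an assumption made here for illustration rather than the paper's experimental protocol.

```python
import math

def miss_probability(n_bits: int, num_errors: int, num_samples: int) -> float:
    """Probability that num_samples i.i.d. uniform inputs from {0,1}^n_bits
    all avoid the num_errors inputs on which a hypothesis is wrong."""
    total = 2 ** n_bits
    if num_errors >= total:
        # Every input is an error, so any non-empty sample hits one.
        return 0.0 if num_samples > 0 else 1.0
    # (1 - m / 2^n)^s, evaluated via log1p for numerical stability
    # when the error fraction is tiny.
    return math.exp(num_samples * math.log1p(-num_errors / total))

if __name__ == "__main__":
    # Hypothetical numbers: one wrong input among 2^64, a billion samples.
    print(miss_probability(64, 1, 10**9))
```

Under this model, a hypothesis with very few errors relative to the size of the input space is almost never caught, no matter how large a polynomially sized sample is.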

📝 Abstract

As state-of-the-art neural networks are deployed on reasoning and algorithmic tasks, exactness guarantees become increasingly important. However, high average-case accuracy can still mask inconsistent behaviors. This motivates exact certification, which asks for the smallest set of labeled examples needed to certify that a learned hypothesis equals the target. We show that while some hypotheses are easy to certify, even minimal overparametrization can make certification exponentially hard across several hypothesis classes. For threshold circuits of depth $\ge 2$, adding a single extra gate can force certificate sizes exponential in the input dimension. We show an analogous hardness result for log-precision Transformers with only constant architectural overhead. We also characterize approximate certification, showing that allowing only polynomially many mistakes still requires exponentially large certificates, whereas constant relative-error guarantees can hide exponentially many mistakes. Empirically, we study certification for constructed circuits and trained Transformers for recognizing binary addition. While the constructed circuits instantiate the exponential barrier for certification, the trained Transformer analysis shows that imperfect models can evade detection by large uniformly sampled certificate candidates.
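The notion of a certificate described above can be made concrete on a toy scale. The following sketch brute-forces the smallest set of labeled inputs that rules out every hypothesis in a finite class other than the target. The two toy classes (a pair of constant functions, and the target together with all of its single-point flips) and every function name are assumptions chosen for illustration; they are not the circuit or Transformer constructions from the paper.

```python
from itertools import combinations, product

def is_certificate(sample, target, hypotheses):
    """True if every hypothesis that agrees with target on `sample`
    is identical to target on all inputs."""
    for h in hypotheses:
        agrees = all(h[x] == target[x] for x in sample)
        if agrees and h != target:
            return False
    return True

def min_certificate_size(target, hypotheses, inputs):
    """Smallest number of labeled inputs certifying `target` within
    `hypotheses`, found by exhaustive search (only feasible for tiny n)."""
    for k in range(len(inputs) + 1):
        for sample in combinations(inputs, k):
            if is_certificate(sample, target, hypotheses):
                return k
    return len(inputs)

def point_flip_class(target, inputs):
    """Toy 'hard' class: the target plus every function that differs
    from it on exactly one input."""
    cls = [dict(target)]
    for x in inputs:
        h = dict(target)
        h[x] = 1 - h[x]
        cls.append(h)
    return cls

n = 3
inputs = list(product([0, 1], repeat=n))
target = {x: 0 for x in inputs}              # constant-zero target
easy = [target, {x: 1 for x in inputs}]      # two constant functions
hard = point_flip_class(target, inputs)

if __name__ == "__main__":
    print(min_certificate_size(target, easy, inputs))
    print(min_certificate_size(target, hard, inputs))
```

In the easy class a single example separates the target from its only competitor, while in the point-flip class every input must appear in the certificate, so its size equals the full input space of 2^n points.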

Problem

Research questions and friction points this paper is trying to address.

certification

overparametrization

threshold circuits

Transformers

exactness guarantees

Innovation

Methods, ideas, or system contributions that make the work stand out.

exact certification

overparametrization

threshold circuits