Proofs as Explanations: Short Certificates for Reliable Predictions

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the problem of constructing concise and robust certificates for prediction interpretability in AI: given a hypothesis class ℋ and a training set that may contain up to b corrupted points, how can one identify a smallest subset S′ such that every classifier in ℋ making at most b mistakes on S′ gives the same prediction on a target input x? To this end, the paper introduces the *robust hollow star number*, a combinatorial measure that characterizes the minimal certificate size. It also defines a distribution-dependent certificate coefficient εₓ and derives matching upper and lower bounds on the sample size needed to obtain a certificate. Drawing on VC-dimension theory, Carathéodory's theorem, and combinatorial geometry, the analysis jointly accounts for hypothesis class structure and the data distribution. The framework characterizes worst-case and distributional certificate lengths for general ℋ and yields explicit bounds for natural classes (e.g., linear separators).

📝 Abstract
We consider a model for explainable AI in which an explanation for a prediction $h(x)=y$ consists of a subset $S'$ of the training data (if it exists) such that all classifiers $h' \in H$ that make at most $b$ mistakes on $S'$ predict $h'(x)=y$. Such a set $S'$ serves as a proof that $x$ indeed has label $y$ under the assumption that (1) the target function $h^\star$ belongs to $H$, and (2) the set $S$ contains at most $b$ corrupted points. For example, if $b=0$ and $H$ is the family of linear classifiers in $\mathbb{R}^d$, and if $x$ lies inside the convex hull of the positive data points in $S$ (and hence every consistent linear classifier labels $x$ as positive), then Carathéodory's theorem states that $x$ lies inside the convex hull of $d+1$ of those points. So, a set $S'$ of size $d+1$ could be released as an explanation for a positive prediction, and would serve as a short proof of correctness of the prediction under the assumption of realizability. In this work, we consider this problem more generally, for general hypothesis classes $H$ and general values $b \geq 0$. We define the notion of the robust hollow star number of $H$ (which generalizes the standard hollow star number), and show that it precisely characterizes the worst-case size of the smallest certificate achievable, and analyze its size for natural classes. We also consider worst-case distributional bounds on certificate size, as well as distribution-dependent bounds that we show tightly control the sample size needed to get a certificate for any given test example. In particular, we define a notion of the certificate coefficient $\varepsilon_x$ of an example $x$ with respect to a data distribution $D$ and target function $h^\star$, and prove matching upper and lower bounds on sample size as a function of $\varepsilon_x$, $b$, and the VC dimension $d$ of $H$.
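The $b=0$, linear-classifier case from the abstract can be made concrete: $x$ is certified positive exactly when it lies in the convex hull of the positive training points, and Carathéodory's theorem lets that certificate be shrunk to at most $d+1$ points. The sketch below (function name and tolerances are illustrative, not from the paper) finds a convex combination by linear programming and then applies the standard Carathéodory reduction via affine dependences:

```python
import numpy as np
from scipy.optimize import linprog

def caratheodory_certificate(points, x, tol=1e-9):
    """Certificate for the b = 0 linear-classifier case: if x lies in the
    convex hull of `points` (rows in R^d), return indices of at most d + 1
    of them whose convex hull still contains x; otherwise return None."""
    n, d = points.shape
    # Feasibility LP: find w >= 0 with points.T @ w = x and sum(w) = 1.
    A_eq = np.vstack([points.T, np.ones((1, n))])
    b_eq = np.append(x, 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    if not res.success:
        return None  # x is outside the hull: no certificate of this form
    w = res.x
    idx = np.flatnonzero(w > tol)
    # Caratheodory reduction: while more than d + 1 points carry weight,
    # use an affine dependence among them to zero out one coefficient.
    while len(idx) > d + 1:
        M = np.vstack([points[idx].T, np.ones((1, len(idx)))])
        c = np.linalg.svd(M)[2][-1]         # null vector: M @ c ~ 0
        if c[np.argmax(np.abs(c))] < 0:
            c = -c                          # ensure a strictly positive entry
        pos = c > tol
        t = np.min(w[idx][pos] / c[pos])    # largest step keeping w >= 0
        w[idx] -= t * c                     # preserves sum(w) and points.T @ w
        idx = np.flatnonzero(w > tol)
    return idx
```

The returned index set is exactly the short proof described in the abstract: every linear classifier consistent with those (at most $d+1$) positive points must also label $x$ positive.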
Problem

Research questions and friction points this paper is trying to address.

Finding minimal training subsets certifying reliable AI predictions
Generalizing certificate size analysis for arbitrary hypothesis classes
Establishing sample complexity bounds for robust explanation certificates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Short certificates for reliable AI predictions
Robust hollow star number generalizing the standard hollow star number
Matching bounds on sample size via certificate coefficient