Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization

📅 2026-05-08

🤖 AI Summary
This work addresses the precise verification of softmax-based attention in Transformers under interval constraints on the pre-softmax scores, where existing approaches produce overly conservative bounds because they relax softmax independently of the downstream objective. The authors introduce Vertex-Softmax, a primitive that establishes, for the first time, that the exact optimum of this score-box problem is always attained at a vertex of the constraint box. Building on this, they prove a threshold structure: once the objective coefficients are sorted, the optimum lies among only linearly many candidates, which gives the tightest sound bound obtainable from score intervals alone. Combining the vertex result and the threshold structure theorem with a CROWN-style convex relaxation framework, the method carries a formal soundness guarantee. Experiments on MNIST, Fashion-MNIST, and CIFAR-10 attention models show significantly improved certified rates and tighter lower bounds, matching or outperforming alpha-CROWN and branch-and-bound baselines at lower computational cost.
📝 Abstract
Certified verification of transformer attention requires bounding the softmax function over interval constraints on the pre-softmax scores. Existing verifiers relax softmax independently of the downstream objective, leaving avoidable slack. We prove that the exact optimum of this score-box problem is attained at a vertex of the constraint box, and establish a threshold structure theorem showing that, after sorting the objective coefficients, the optimum lies among only linearly many candidates, yielding the Vertex-Softmax primitive with log-linear complexity in the sequence length. We further prove a formal optimality result showing that Vertex-Softmax is the tightest sound bound obtainable from score intervals alone, characterizing precisely what additional structure (score correlations, score-value coupling) is needed for further improvement. Integrated into a CROWN (Convex Relaxation based Optimization for Worst-case Neurons)-style verifier with a formal soundness guarantee, Vertex-Softmax significantly improves certified rates and substantially tightens lower bounds across MNIST, Fashion-MNIST, and CIFAR-10 attention models, while consistently matching or outperforming alpha-CROWN and branch-and-bound baselines at a fraction of their cost.
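The vertex and threshold claims in the abstract can be illustrated with a small sketch. It assumes the score-box problem is to maximize a linear objective sum_i c[i] * softmax(z)[i] subject to lo[i] <= z[i] <= hi[i]. The abstract mentions objective coefficients but does not spell out this formulation, so it is an assumption here. The function names (vertex_softmax_upper, brute_force_upper) are hypothetical, and this is not the authors' implementation. Under that assumption the objective is a weighted average of the c[i] with weights exp(z[i]), so raising a score helps exactly when its coefficient exceeds the current objective value. That is why only the "top-k coefficients at the upper bound, the rest at the lower bound" vertices need to be checked.

```python
import math
from itertools import product


def vertex_softmax_upper(c, lo, hi):
    """Sketch: max of sum_i c[i]*softmax(z)[i] over the box lo <= z <= hi.

    Assumes a linear objective over the softmax output (an assumption,
    not taken verbatim from the paper). Sort + one sweep: O(n log n).
    """
    n = len(c)
    shift = max(hi)  # subtract the largest score before exp for stability
    order = sorted(range(n), key=lambda i: c[i], reverse=True)
    cs = [c[i] for i in order]
    w_lo = [math.exp(lo[i] - shift) for i in order]
    w_hi = [math.exp(hi[i] - shift) for i in order]

    # Candidate k = 0: every score sits at its lower bound.
    num = sum(ci * w for ci, w in zip(cs, w_lo))
    den = sum(w_lo)
    best = num / den if den > 0 else -math.inf

    # Candidate k+1: move the (k+1)-th largest coefficient to its upper bound.
    for k in range(n):
        delta = w_hi[k] - w_lo[k]
        num += cs[k] * delta
        den += delta
        if den > 0:  # guard against underflow on very wide boxes
            best = max(best, num / den)
    return best


def brute_force_upper(c, lo, hi):
    """Reference check: evaluate the objective at all 2^n box vertices."""
    best = -math.inf
    for z in product(*zip(lo, hi)):
        m = max(z)
        w = [math.exp(zi - m) for zi in z]
        best = max(best, sum(ci * wi for ci, wi in zip(c, w)) / sum(w))
    return best
```

A lower bound follows from the same routine by negating the coefficients: the minimum of the objective equals -vertex_softmax_upper([-ci for ci in c], lo, hi). The brute-force helper is only for checking small cases, since its cost grows exponentially with the sequence length.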
Problem

Research questions and friction points this paper is trying to address.

Transformer verification
softmax optimization
interval constraints
certified robustness
attention mechanism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vertex-Softmax
Transformer verification
exact softmax optimization
convex relaxation
certified robustness