AI Summary
Existing robustness certification methods against data poisoning attacks suffer from poor scalability and non-convergence: conventional interval- and polyhedral-based approaches yield bounds that can only expand, becoming increasingly loose. Method: This paper introduces, for the first time, bilinear mixed-integer programming (BMIP) into poisoning-robustness certification, proposing an iterative convex-relaxation framework that models the reachable set of parameters during training, yielding verifiable, tight, and deterministic bounds on parameter evolution trajectories. Contribution/Results: The approach overcomes the convergence bottleneck inherent in prior certification frameworks and eliminates bound divergence. Empirical evaluation demonstrates substantial gains in certification precision, delivering the first provably sound and controllably tight robustness guarantee against data poisoning attacks for AI systems.
Abstract
Data poisoning attacks pose one of the biggest threats to modern AI systems, necessitating robust defenses. While extensive efforts have been made to develop empirical defenses, attackers continue to evolve, creating sophisticated methods to circumvent these measures. To address this, we must move beyond empirical defenses and establish provable certification methods that guarantee robustness. This paper introduces a novel certification approach, BiCert, using Bilinear Mixed Integer Programming (BMIP) to compute sound deterministic bounds that provide such provable robustness. Using BMIP, we compute the reachable set of parameters that could result from training with potentially manipulated data. A key element to make this computation feasible is to relax the reachable parameter set to a convex set between training iterations. At test time, this parameter set allows us to predict all possible outcomes, guaranteeing robustness. BiCert is more precise than previous methods, which rely solely on interval and polyhedral bounds. Crucially, our approach overcomes the fundamental limitation of prior approaches where parameter bounds could only grow, often uncontrollably. We show that BiCert's tighter bounds eliminate a key source of divergence issues, resulting in more stable training and higher certified accuracy.
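The core idea, bounding the set of parameters reachable under poisoned training and then checking whether every parameter in that set yields the same prediction, can be illustrated with a much simpler interval-arithmetic toy. This is a hedged sketch, not BiCert's BMIP formulation: it tracks a 1-D linear model's weight as an interval `[w_lo, w_hi]` under a hypothetical bounded label perturbation `eps`, and all function names here are illustrative.

```python
# Toy sketch (NOT BiCert's BMIP method): track the reachable weight interval
# of a 1-D linear regression under labels perturbed by at most eps, then
# certify a test prediction if its sign is fixed over the whole interval.

def interval_sgd_step(w_lo, w_hi, xs, ys, lr, eps):
    """One SGD step on squared loss; each label may shift by at most eps.
    The per-point gradient 2*x*(w*x - y) is linear in w and in y separately,
    so its extremes over the box [w_lo, w_hi] x [y-eps, y+eps] lie at corners."""
    g_lo = g_hi = 0.0
    for x, y in zip(xs, ys):
        corners = [2 * x * (w * x - yy)
                   for w in (w_lo, w_hi)
                   for yy in (y - eps, y + eps)]
        g_lo += min(corners)
        g_hi += max(corners)
    g_lo /= len(xs)
    g_hi /= len(xs)
    # w <- w - lr * g: subtracting an interval swaps its endpoints
    return w_lo - lr * g_hi, w_hi - lr * g_lo

def certify_sign(w_lo, w_hi, x):
    """Prediction sign(w*x) is certified iff it agrees for all w in the set."""
    lo, hi = sorted((w_lo * x, w_hi * x))
    return lo > 0 or hi < 0
```

In this toy, plain interval propagation already exhibits the divergence problem the abstract describes: with `eps > 0` the interval widens at every step and may never shrink. BiCert's contribution is replacing such loose one-directional bounds with tighter BMIP-based convex relaxations of the reachable set at each iteration.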