Abstract Gradient Training: A Unified Certification Framework for Data Poisoning, Unlearning, and Differential Privacy

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of certifying model robustness under training-data perturbations, specifically adversarial data poisoning, machine unlearning (sample deletion), and differential privacy (single-point substitution). Unlike conventional certification methods that focus on input perturbations at inference time, the paper proposes the first unified, provably sound framework that models training-stage data modifications (addition, deletion, and modification) as perturbations in parameter space. Leveraging first-order optimization dynamics and abstract interpretation, it symbolically characterizes how perturbations propagate through gradient updates and rigorously derives bounds on the reachable set of model parameters. Experimental evaluation across diverse models and datasets demonstrates high-precision, cross-scenario certification, significantly broadening both the scope of applicability and the strength of theoretical robustness guarantees.

📝 Abstract
The impact of inference-time data perturbation (e.g., adversarial attacks) has been extensively studied in machine learning, leading to well-established certification techniques for adversarial robustness. In contrast, certifying models against training data perturbations remains a relatively under-explored area. These perturbations can arise in three critical contexts: adversarial data poisoning, where an adversary manipulates training samples to corrupt model performance; machine unlearning, which requires certifying model behavior under the removal of specific training data; and differential privacy, where guarantees must be given with respect to substituting individual data points. This work introduces Abstract Gradient Training (AGT), a unified framework for certifying robustness of a given model and training procedure to training data perturbations, including bounded perturbations, the removal of data points, and the addition of new samples. By bounding the reachable set of parameters, i.e., establishing provable parameter-space bounds, AGT provides a formal approach to analyzing the behavior of models trained via first-order optimization methods.
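The core idea of bounding the reachable parameter set can be sketched as sound interval propagation through each first-order update. The sketch below is illustrative only, not the paper's actual algorithm: the least-squares model, the learning rate, and the `k`/`eps` poisoning budget (at most `k` perturbed samples, each shifting every gradient coordinate by at most `eps`) are all assumptions made for the example.

```python
import numpy as np

def interval_gradient(lo, hi, X, y):
    """Element-wise bounds on the MSE gradient g(w) = X.T @ (X @ w - y) / n,
    valid for every w with lo <= w <= hi (standard interval arithmetic)."""
    n = len(X)
    # Bounds on the residuals r = X @ w - y: each product X[i, j] * w[j]
    # attains its extremes at an endpoint of [lo[j], hi[j]].
    r_lo = np.minimum(X * lo, X * hi).sum(axis=1) - y
    r_hi = np.maximum(X * lo, X * hi).sum(axis=1) - y
    # The same endpoint argument bounds the back-multiplication by X.T.
    g_lo = np.minimum(X.T * r_lo, X.T * r_hi).sum(axis=1) / n
    g_hi = np.maximum(X.T * r_lo, X.T * r_hi).sum(axis=1) / n
    return g_lo, g_hi

def abstract_sgd_step(lo, hi, X, y, lr=0.1, k=1, eps=0.5):
    """One sound gradient-descent step on the parameter interval [lo, hi].
    k and eps encode a hypothetical threat model: at most k training points
    are perturbed, each moving every gradient coordinate by at most eps."""
    g_lo, g_hi = interval_gradient(lo, hi, X, y)
    slack = lr * k * eps / len(X)          # worst-case poisoning contribution
    return lo - lr * g_hi - slack, hi - lr * g_lo + slack

# Tiny demo: the clean (unpoisoned) SGD trajectory stays inside the
# abstract interval, by induction over the update steps.
X = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 1.5], [2.0, 0.0]])
y = np.array([1.0, 0.0, 2.0, 1.0])
w = np.zeros(2)                            # concrete parameters
lo, hi = w.copy(), w.copy()                # abstract parameter interval
for _ in range(20):
    w = w - 0.1 * X.T @ (X @ w - y) / len(X)
    lo, hi = abstract_sgd_step(lo, hi, X, y)
print(lo <= w, w <= hi)                    # containment holds element-wise
```

Any property that holds for every parameter vector inside the final interval (e.g. a prediction bound) is then certified for every admissible poisoned, unlearned, or substituted training set under the assumed budget.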
Problem

Research questions and friction points this paper is trying to address.

Certifying model robustness against adversarial data poisoning attacks
Providing guarantees for machine unlearning with data removal
Establishing differential privacy bounds through parameter-space analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Abstract Gradient Training bounds parameter reachable sets
Unified certification for data poisoning and unlearning
Provides provable parameter-space bounds for robustness