🤖 AI Summary
This work addresses the robustness evaluation of neural classifiers on tabular data by proposing an adversarial attack framework that incorporates automatically mined database integrity constraints into the attack pipeline. To this end, it introduces TabPGD, a projected gradient descent algorithm tailored to tabular data that combines gradient-based feature selection with perturbation clipping; the resulting adversarial examples are then projected onto the mined constraints to repair feasibility. TabPGD thus generates adversarial examples that respect semantic integrity constraints while minimizing perturbation cost. Experiments across three real-world tabular datasets and two neural model architectures demonstrate that the mined constraints achieve superior soundness and completeness compared to those used in prior work. Furthermore, the attack significantly improves success rates under feasibility constraints, reduces the average number of perturbed features by 37%, and decreases the L₂ norm of perturbations by 52%.
📝 Abstract
This work presents CaFA, a system for Cost-aware Feasible Attacks, for assessing the robustness of neural tabular classifiers against adversarial examples realizable in the problem space while minimizing adversaries' effort. To this end, CaFA leverages TabPGD—an algorithm we set forth to generate adversarial perturbations suitable for tabular data—and incorporates integrity constraints automatically mined by state-of-the-art database methods. After producing adversarial examples in the feature space via TabPGD, CaFA projects them onto the mined constraints, leading, in turn, to better attack realizability. We tested CaFA with three datasets and two architectures and found, among other things, that the constraints we use are of higher quality (measured via soundness and completeness) than ones employed in prior work. Moreover, CaFA achieves higher feasible success rates—i.e., it generates adversarial examples that are often misclassified while satisfying constraints—than prior attacks, while simultaneously perturbing fewer features with lower magnitudes, thus saving effort and improving inconspicuousness. We open-source CaFA, hoping it will serve as a generic system enabling machine-learning engineers to assess their models' robustness against realizable attacks, thus advancing deployed models' trustworthiness.
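The attack pipeline described above—a PGD-style gradient step, gradient-based feature selection, perturbation clipping, and a final projection onto integrity constraints—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy box constraints standing in for mined integrity constraints, and the loss-gradient callback are all hypothetical.

```python
import numpy as np

# Hypothetical stand-ins for mined integrity constraints: per-feature
# valid ranges (e.g., age in [0, 120]). Real mined constraints can be
# richer (e.g., relations between columns); box bounds keep the sketch simple.
FEATURE_LO = np.array([0.0, 0.0, 0.0])
FEATURE_HI = np.array([120.0, 1.0, 10.0])

def project_constraints(x):
    """Feasibility repair: project onto the (toy) constraint set."""
    return np.clip(x, FEATURE_LO, FEATURE_HI)

def tabpgd_like_attack(x, grad_fn, eps=0.5, step=0.1, iters=20, k=1):
    """PGD-style loop for tabular data (illustrative sketch).

    grad_fn(x) returns the gradient of the attacker's loss w.r.t. x.
    Each iteration: (1) pick the top-k features by gradient magnitude,
    (2) take a signed step on those features only, (3) clip the total
    perturbation to the eps-ball, (4) repair feasibility by projection.
    """
    x0 = x.copy()
    x_adv = x.copy()
    for _ in range(iters):
        g = grad_fn(x_adv)
        mask = np.zeros_like(g)
        mask[np.argsort(np.abs(g))[-k:]] = 1.0   # gradient-based feature selection
        x_adv = x_adv + step * np.sign(g) * mask  # sparse gradient step
        x_adv = x0 + np.clip(x_adv - x0, -eps, eps)  # perturbation clipping
        x_adv = project_constraints(x_adv)           # constraint projection
    return x_adv
```

In a real attack, `grad_fn` would backpropagate through the target classifier, and the projection would enforce the automatically mined constraints rather than fixed bounds; the sketch only shows how the three mechanisms compose.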