CatBack: Universal Backdoor Attacks on Tabular Data via Categorical Encoding

📅 2025-11-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To launch universal backdoor attacks on tabular data that mixes numerical and categorical features, this paper proposes CatBack, the first differentiable backdoor attack framework tailored for heterogeneous tabular data. Its core innovation is a differentiable floating-point encoding for categorical features, which lets them participate in gradient-based optimization so that a single perturbation can be applied across all feature types. CatBack supports both white-box and black-box settings, integrates with mainstream ML frameworks, and shows strong cross-model and cross-scenario transferability while remaining stealthy. Evaluated on five real-world tabular datasets and four model families, CatBack achieves an attack success rate of up to 100% and evades state-of-the-art defenses, including Spectral Signatures, Neural Cleanse, and multiple anomaly detection methods, outperforming existing approaches such as Tabdoor in both efficacy and robustness.

📝 Abstract

Backdoor attacks in machine learning have drawn significant attention for their potential to compromise models stealthily, yet most research has focused on homogeneous data such as images. In this work, we propose a novel backdoor attack on tabular data, which is particularly challenging because such data contains both numerical and categorical features. Our key idea is a novel technique that converts categorical values into floating-point representations. This encoding preserves enough information to keep clean-model accuracy comparable to that of traditional methods such as one-hot or ordinal encoding. Using it, we create a gradient-based universal perturbation that applies to all features, including categorical ones. We evaluate our method on five datasets and four popular models. Our results show an attack success rate of up to 100% in both white-box and black-box settings (including real-world applications such as Vertex AI), revealing a severe vulnerability in models trained on tabular data. Our method surpasses previous work such as Tabdoor in performance while remaining stealthy against state-of-the-art defense mechanisms. We evaluate our attack against Spectral Signatures, Neural Cleanse, Beatrix, and Fine-Pruning, all of which fail to defend against it. We also verify that our attack bypasses popular outlier detection mechanisms.
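The abstract does not spell out how categorical values become floats, so the following is only an illustrative sketch of the general idea, not the paper's scheme. It assumes a hypothetical frequency-ranked mapping onto evenly spaced points in [0, 1] and a nearest-value decoder; the function names `fit_float_encoding`, `encode`, and `decode` are made up for this example.

```python
from collections import Counter


def fit_float_encoding(values):
    """Map each category to a float in [0, 1], ordered by frequency.

    Hypothetical scheme for illustration only; the paper's actual
    encoding is not described in this summary.
    """
    counts = Counter(values)
    # Most frequent first; ties broken by name so the result is deterministic.
    ordered = sorted(counts, key=lambda c: (-counts[c], str(c)))
    if len(ordered) == 1:
        return {ordered[0]: 0.0}
    step = 1.0 / (len(ordered) - 1)
    return {cat: i * step for i, cat in enumerate(ordered)}


def encode(value, table):
    """Look up the float assigned to a category."""
    return table[value]


def decode(x, table):
    """Snap an arbitrary float back to the nearest valid category."""
    return min(table, key=lambda cat: abs(table[cat] - x))


colors = ["red", "blue", "red", "green", "red", "blue"]
table = fit_float_encoding(colors)
print(table)  # {'red': 0.0, 'blue': 0.5, 'green': 1.0}

# A continuous nudge of 0.3 moves "blue" (0.5) closer to "green" (1.0).
perturbed = encode("blue", table) + 0.3
print(decode(perturbed, table))  # green
```

Because each category is a single float rather than a one-hot vector, a continuous perturbation can be added to it directly and then snapped back to a valid category, which is what makes gradient-based optimization over categorical columns conceivable.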

Problem

Research questions and friction points this paper is trying to address.

Proposes universal backdoor attacks on heterogeneous tabular data with mixed features

Develops gradient-based perturbation method for categorical and numerical features simultaneously

Achieves high attack success while evading state-of-the-art detection mechanisms

Innovation

Methods, ideas, or system contributions that make the work stand out.

Converts categorical values to floating-point representations

Creates gradient-based universal perturbation for all features

Bypasses multiple defense mechanisms while remaining stealthy
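A gradient-based universal perturbation can be sketched as a single offset vector optimized over many rows at once. The example below is a minimal sketch under strong assumptions: a fixed toy logistic model (weights `w`, bias `b`), a per-feature budget `eps`, and plain gradient descent. None of these choices come from the paper, and `craft_universal_delta` is a hypothetical name.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def craft_universal_delta(X, w, b, eps=1.0, lr=0.1, steps=200):
    """Find one shared delta that pushes every row toward class 1.

    Minimizes the mean of -log(sigmoid(w . (x + delta) + b)) by gradient
    descent, clipping each coordinate of delta to [-eps, eps].
    Toy illustration only; not the paper's optimization procedure.
    """
    d = len(w)
    delta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x in X:
            z = sum(wj * (xj + dj) for wj, xj, dj in zip(w, x, delta)) + b
            # Gradient of -log(sigmoid(z)) w.r.t. delta is -(1 - sigmoid(z)) * w.
            coef = -(1.0 - sigmoid(z))
            for j in range(d):
                grad[j] += coef * w[j] / len(X)
        # Descent step followed by projection onto the eps box.
        delta = [max(-eps, min(eps, dj - lr * gj)) for dj, gj in zip(delta, grad)]
    return delta


def predict(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Column 0: a numeric feature; column 1: a float-encoded categorical feature.
X = [[0.1, 0.6], [0.2, 0.9], [0.0, 0.4]]
w, b = [2.0, -1.0], -1.0

delta = craft_universal_delta(X, w, b)
before = [predict(x, w, b) for x in X]
after = [predict([xj + dj for xj, dj in zip(x, delta)], w, b) for x in X]
print(delta, before, after)
```

In a full backdoor attack the resulting offset would serve as the trigger added to poisoned training rows, with any float-encoded categorical column snapped back to a valid category afterwards; that part is omitted here.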

🔎 Similar Papers

Tabdoor: Backdoor Vulnerabilities in Transformer-based Neural Networks for Tabular Data