UTOPIA: Unlearnable Tabular Data via Decoupled Shortcut Embedding

📅 2026-02-07

🤖 AI Summary

Existing unlearnable-example (UE) methods transfer poorly to sensitive tabular data such as financial and healthcare records. This work proposes UTOPIA, a decoupled UE mechanism tailored to the characteristics of tabular data. Using feature saliency analysis, it separates high-saliency semantic features from low-saliency redundant ones, applying semantic obfuscation to the former and embedding a hyper-correlated shortcut in the latter. Guided by a Spectral Dominance condition, under which certified unlearnability is shown to be feasible, the method constructs constraint-aware dominant shortcuts that keep the table valid. Experiments across multiple tabular datasets and model architectures show that the approach drives unauthorized training toward near-random performance, outperforming strong UE baselines and transferring well across architectures.

📝 Abstract

Unlearnable examples (UE) have emerged as a practical mechanism to prevent unauthorized model training on private vision data, but extending this protection to tabular data is nontrivial. Tabular data in finance and healthcare is highly sensitive, yet existing UE methods transfer poorly because tabular features mix numerical and categorical constraints and exhibit saliency sparsity, with learning dominated by a few dimensions. Under a Spectral Dominance condition, we show that certified unlearnability is feasible when the poison spectrum overwhelms the clean semantic spectrum. Guided by this, we propose Unlearnable Tabular Data via DecOuPled Shortcut EmbeddIng (UTOPIA), which exploits feature redundancy to decouple optimization into two channels: high-saliency features for semantic obfuscation and low-saliency redundant features for embedding a hyper-correlated shortcut, yielding constraint-aware dominant shortcuts while preserving tabular validity. Extensive experiments across tabular datasets and models show that UTOPIA drives unauthorized training toward near-random performance, outperforming strong UE baselines and transferring well across architectures.
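The two-channel idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes absolute label correlation as a stand-in saliency score, random noise as a stand-in for semantic obfuscation, a fixed class-keyed offset as the shortcut, and per-column range clipping as the only validity constraint (categorical constraints are not handled). The function names `saliency_split`, `class_code`, and `poison`, and the parameters `k`, `strength`, and `noise`, are hypothetical.

```python
import numpy as np

def saliency_split(X, y, k):
    """Split column indices into the k most label-correlated and the rest.

    Assumption: |Pearson correlation with y| stands in for feature saliency.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    score = np.abs(Xc.T @ yc) / denom
    order = np.argsort(-score)
    return order[:k], order[k:]

def class_code(y, n_cols):
    """Return an (n_rows, n_cols) matrix of +1/-1 offsets keyed to the class.

    Column j is +1 for rows of class (j mod n_classes) and -1 otherwise,
    so each column carries a deterministic class signal.
    """
    classes = np.unique(y)
    cls_idx = np.searchsorted(classes, y)
    col_owner = np.arange(n_cols) % len(classes)
    return np.where(cls_idx[:, None] == col_owner[None, :], 1.0, -1.0)

def poison(X, y, k, strength=0.5, noise=1.0, seed=0):
    """Toy decoupled perturbation of a purely numerical table."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    high_idx, low_idx = saliency_split(X, y, k)
    Xp = X.astype(float).copy()
    # Channel 1 (placeholder obfuscation): blur high-saliency columns with noise.
    scale = noise * X[:, high_idx].std(axis=0)
    Xp[:, high_idx] += scale * rng.standard_normal((len(X), len(high_idx)))
    # Channel 2 (shortcut): shift low-saliency columns by a class-keyed offset.
    Xp[:, low_idx] += strength * (hi - lo)[low_idx] * class_code(y, len(low_idx))
    # Validity constraint (assumed): keep every column inside its observed range.
    return np.clip(Xp, lo, hi), high_idx, low_idx
```

Calling `poison(X, y, k=2)` on a numerical array `X` with labels `y` returns the perturbed table together with the two index sets. A model trained on the perturbed table can fit the labels from the low-saliency columns alone, which is the shortcut behavior the abstract describes; the paper's optimized, constraint-aware shortcut and its Spectral Dominance guarantee are not reproduced here.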

Problem

Research questions and friction points this paper is trying to address.

unlearnable examples

tabular data

data privacy

spectral dominance

feature constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

unlearnable examples

tabular data

decoupled optimization