Adversarial Subspace Generation for Outlier Detection in High-Dimensional Data

📅 2025-04-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Problem: Anomaly detection in high-dimensional tabular data is complicated by the Multiple Views effect, in which the data is distributed across multiple lower-dimensional feature subspaces with distinct statistical behavior. Existing methods lack a formal characterization of this structure and rely on heuristic subspace search. Method: This paper introduces Myopic Subspace Theory (MST), a theoretical framework that formally defines the multi-view structure and formulates subspace selection as a differentiable, end-to-end stochastic optimization problem. Building on MST, the authors propose V-GAN, a generative method trained to solve that problem without an exhaustive search over the feature space while preserving the intrinsic data structure; the learned subspaces are then used to build one-class classification ensembles. Results: On 42 real-world datasets, ensembles built on V-GAN subspaces significantly improve one-class classification performance over existing subspace selection, feature selection, and embedding methods. On synthetic data, V-GAN identifies subspaces more accurately and scales better than other subspace selection methods.

📝 Abstract
Outlier detection in high-dimensional tabular data is challenging since data is often distributed across multiple lower-dimensional subspaces -- a phenomenon known as the Multiple Views effect (MV). This effect led to a large body of research focused on mining such subspaces, known as subspace selection. However, as the precise nature of the MV effect was not well understood, traditional methods had to rely on heuristic-driven search schemes that struggle to accurately capture the true structure of the data. Properly identifying these subspaces is critical for unsupervised tasks such as outlier detection or clustering, where misrepresenting the underlying data structure can hinder performance. We introduce Myopic Subspace Theory (MST), a new theoretical framework that mathematically formulates the Multiple Views effect and casts subspace selection as a stochastic optimization problem. Based on MST, we introduce V-GAN, a generative method trained to solve this optimization problem. This approach avoids any exhaustive search over the feature space while ensuring that the intrinsic data structure is preserved. Experiments on 42 real-world datasets show that using V-GAN subspaces to build ensemble methods leads to a significant increase in one-class classification performance compared to existing subspace selection, feature selection, and embedding methods. Further experiments on synthetic data show that V-GAN identifies subspaces more accurately while scaling better than other relevant subspace selection methods. These results confirm the theoretical guarantees of our approach and also highlight its practical viability in high-dimensional settings.
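
To make the idea of a subspace-driven ensemble concrete, here is a minimal sketch. It is not V-GAN and not the authors' implementation: the subspaces are hand-picked for the toy data rather than learned, the detector is a plain k-nearest-neighbour distance score used as a stand-in for a one-class classifier, and averaging the per-subspace scores is only one assumed way to aggregate them. All function names (`project`, `knn_score`, `ensemble_score`) are hypothetical.

```python
import math
import random


def project(row, subspace):
    """Keep only the features whose indices are listed in `subspace`."""
    return [row[j] for j in subspace]


def knn_score(train, point, k=5):
    """Outlier score: distance from `point` to its k-th nearest training point."""
    return sorted(math.dist(point, t) for t in train)[k - 1]


def ensemble_score(train, point, subspaces, k=5):
    """Average the kNN score over all subspaces (assumed aggregation rule)."""
    scores = [
        knn_score([project(r, s) for r in train], project(point, s), k)
        for s in subspaces
    ]
    return sum(scores) / len(scores)


# Toy data: features 0 and 1 are strongly correlated, features 2..5 are noise.
random.seed(0)
train = []
for _ in range(200):
    x0 = random.gauss(0, 1)
    train.append([x0, x0 + random.gauss(0, 0.05)]
                 + [random.gauss(0, 1) for _ in range(4)])

# Hand-picked subspaces (an assumption; V-GAN would learn these instead).
subspaces = [(0, 1), (2, 3, 4, 5)]

inlier = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]    # respects the correlation
outlier = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0]  # breaks it, but each marginal looks normal

print("inlier score: ", ensemble_score(train, inlier, subspaces))
print("outlier score:", ensemble_score(train, outlier, subspaces))
```

The outlier differs from the inlier only in the joint behavior of features 0 and 1, so it is the `(0, 1)` subspace that separates the two points; the noise subspace contributes the same score to both.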

Problem

Research questions and friction points this paper is trying to address.

Outlier detection is hard when high-dimensional data is distributed across multiple lower-dimensional subspaces (the Multiple Views effect)
Heuristic-driven subspace selection struggles to capture true data structure
Proposes Myopic Subspace Theory and V-GAN for accurate subspace identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Myopic Subspace Theory (MST) framework
Proposes V-GAN, a generative method trained to solve the resulting stochastic optimization problem
Avoids exhaustive search, preserves data structure
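
A quick calculation shows why avoiding exhaustive search matters: a dataset with d features has 2^d - 1 non-empty feature subsets, so enumerating candidate subspaces becomes infeasible well before d reaches the hundreds. The snippet below is only an illustration of that count; the helper names are hypothetical and it says nothing about how V-GAN itself works.

```python
from itertools import combinations


def count_subspaces(d):
    """Number of non-empty feature subsets of a d-dimensional feature space."""
    return 2 ** d - 1


def enumerate_subspaces(d):
    """Explicitly list every non-empty subset of feature indices 0..d-1."""
    return [s for r in range(1, d + 1) for s in combinations(range(d), r)]


print(len(enumerate_subspaces(4)))  # 15
print(count_subspaces(10))          # 1023
print(count_subspaces(100))         # 1267650600228229401496703205375
```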

Authors

Jose Cribeiro-Ramallo
Karlsruhe Institute of Technology
Federico Matteucci
Karlsruhe Institute of Technology
Paul Enciu
Karlsruhe Institute of Technology
Alexander Jenke
Karlsruhe Institute of Technology
Vadim Arzamasov
Karlsruhe Institute of Technology
Thorsten Strufe
Professor of Privacy and Security, Karlsruhe Institute of Technology; Adjunct Professor TU Dresden
Privacy, Security, Networks, User Analysis
Klemens Bohm
Karlsruhe Institute of Technology