Learning In-Distribution Representations for Anomaly Detection

📅 2025-01-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Anomaly detection in high-dimensional data suffers from intra-class dispersion of in-distribution (ID) samples and ambiguous boundaries between ID and out-of-distribution (OOD) data. To address this, we propose Focused In-Distribution Representation Modeling (FIRM), the first self-supervised contrastive learning framework specifically designed for anomaly detection. FIRM constructs a reconstruction-based contrastive task using controllably synthesized outliers—enabling joint optimization of ID intra-class compactness and ID/OOD inter-class separability without access to ground-truth OOD labels. It further incorporates geometric constraints in representation space and a multi-scoring-function-compatible architecture to enhance feature discriminability and scoring robustness. Extensive experiments on standard benchmarks demonstrate significant improvements over both conventional and supervised contrastive methods. Ablation studies confirm FIRM’s effectiveness in improving representation quality and cross-scoring generalization. The code is publicly available.

📝 Abstract
Anomaly detection involves identifying data patterns that deviate from the anticipated norm. Traditional methods struggle in high-dimensional spaces due to the curse of dimensionality. In recent years, self-supervised learning, particularly through contrastive objectives, has driven advances in anomaly detection. However, vanilla contrastive learning struggles to align with the unique demands of anomaly detection, as it lacks a pretext task tailored to the homogeneous nature of In-Distribution (ID) data and the diversity of Out-of-Distribution (OOD) anomalies. Methods that attempt to address these challenges, such as introducing hard negatives through synthetic outliers, Outlier Exposure (OE), and supervised objectives, often rely on pretext tasks that fail to balance compact clustering of ID samples with sufficient separation from OOD data. In this work, we propose Focused In-distribution Representation Modeling (FIRM), a contrastive learning objective specifically designed for anomaly detection. Unlike existing approaches, FIRM incorporates synthetic outliers into its pretext task in a way that actively shapes the representation space, promoting compact clustering of ID samples while enforcing strong separation from outliers. This formulation addresses the challenges of class collision, enhancing both the compactness of ID representations and the discriminative power of the learned feature space. We show that FIRM surpasses other contrastive methods in standard benchmarks, significantly enhancing anomaly detection compared to both traditional and supervised contrastive learning objectives. Our ablation studies confirm that FIRM consistently improves the quality of representations and shows robustness across a range of scoring methods. The code is available at: https://github.com/willtl/firm.
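The core idea in the abstract, pulling two augmented views of an in-distribution sample together while pushing them away from synthetic outliers, can be sketched as a simple contrastive objective. This is a minimal NumPy illustration, not the paper's actual loss; the function name, the InfoNCE-style form, and the temperature value are assumptions made for exposition.

```python
import numpy as np

def l2_normalize(z, axis=-1):
    """Project embeddings onto the unit hypersphere."""
    return z / np.linalg.norm(z, axis=axis, keepdims=True)

def firm_style_loss(z_i, z_j, z_out, tau=0.1):
    """Hypothetical simplification of a FIRM-style objective:
    align two views of each ID sample (compactness) while repelling
    synthetic outliers (separability).

    z_i, z_j : (N, D) embeddings of two augmented views of N ID samples
    z_out    : (M, D) embeddings of synthetic outliers
    """
    z_i, z_j, z_out = map(l2_normalize, (z_i, z_j, z_out))
    # Positive term: cosine similarity between the two ID views.
    pos = np.exp(np.sum(z_i * z_j, axis=1) / tau)      # shape (N,)
    # Negative term: similarity of each ID view to every outlier.
    neg = np.exp(z_i @ z_out.T / tau).sum(axis=1)      # shape (N,)
    return float(np.mean(-np.log(pos / (pos + neg))))
```

The loss is small when ID views coincide and outliers lie far away on the sphere, and grows when outliers intrude on the ID cluster, which is the compactness/separation trade-off the abstract describes.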
Problem

Research questions and friction points this paper is trying to address.

Anomaly Detection
High-dimensional Data
Self-supervised Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

FIRM
Contrastive Learning
Anomaly Detection
William T. Lunardi, Technology Innovation Institute (TII)
Abdulrahman Banabila, Technology Innovation Institute (TII)
Dania Herzalla, unknown affiliation
Martin Andreoni, Technology Innovation Institute (TII)
Network Security · Intrusion Detection · Cloud Computing · Secure Autonomous Systems