Robust Novelty Detection through Style-Conscious Feature Ranking

📅 2023-10-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Novelty detection under distribution shift must distinguish changes in semantic content from spurious variations in style. Method: We formally define the semantic-style disentanglement problem and propose a style-aware feature ranking mechanism that scores each feature by its cross-environment distribution distance (e.g., Wasserstein distance) and suppresses the style-correlated features. This enables interpretable feature reweighting and selection. The approach builds on representations from large-scale pretrained models and requires no labeled anomalies, although it relies on training data grouped into environments. Contribution/Results: Evaluated on multi-domain generalization benchmarks and a synthetic dataset, the method improves novelty detection accuracy while separating style shifts from semantic anomalies. It provides interpretable feature attribution and generalizes across heterogeneous domains. The implementation is publicly available.
📝 Abstract
Novelty detection seeks to identify samples that deviate from a known distribution, yet data can shift in many ways, and only a few of those shifts are relevant. In line with the out-of-distribution generalization literature, we advocate for a formal distinction between task-relevant semantic (content) changes and irrelevant style changes. This distinction forms the basis for robust novelty detection: identifying semantic changes while remaining resilient to distributional shifts in style. To this end, we introduce Stylist, a method that uses representations from large-scale pretrained models and selectively discards environment-biased features. By computing per-feature scores based on feature distribution distances between environments, Stylist removes features responsible for spurious correlations, improving novelty detection performance. Evaluations on adapted domain generalization datasets and a synthetic dataset show that Stylist improves novelty detection across diverse datasets with stylistic and content shifts. The code is available at https://github.com/bit-ml/Stylist.
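The per-feature scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes the 1-D Wasserstein-1 distance as the distribution distance, the mean over all environment pairs as the aggregation, and dropping a fixed number of top-ranked features as the selection rule. The function names `wasserstein_1d`, `style_scores`, and `select_features` are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two 1-D empirical samples.

    Computed as the integral of |CDF_a - CDF_b| over the real line.
    """
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def style_scores(envs):
    """Score each feature by how much its distribution moves across environments.

    envs: list of at least two environments, each a list of feature vectors.
    Assumption: the score is the mean pairwise distance between environments.
    """
    n_features = len(envs[0][0])
    pairs = list(combinations(range(len(envs)), 2))
    scores = []
    for j in range(n_features):
        columns = [[row[j] for row in env] for env in envs]
        dists = [wasserstein_1d(columns[p], columns[q]) for p, q in pairs]
        scores.append(sum(dists) / len(dists))
    return scores


def select_features(envs, n_drop):
    """Return indices of the features kept after dropping the n_drop
    features whose distributions vary most across environments."""
    scores = style_scores(envs)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    dropped = set(ranked[:n_drop])
    return [j for j in range(len(scores)) if j not in dropped]


# Toy data: feature 0 is stable across environments, feature 1 shifts by 5.
env_a = [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]]
env_b = [[0.0, 5.0], [1.0, 6.0], [2.0, 5.0]]
kept = select_features([env_a, env_b], n_drop=1)
```

In this toy example the shifted feature receives the higher score and is discarded, so only the stable feature is kept. A novelty detector would then be fitted on the kept features alone.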
Problem

Research questions and friction points this paper is trying to address.

Novelty Detection
Content-Style Separation
Sample Recognition Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stylist tool
feature ranking
style variation removal