LEFT: Learnable Fusion of Tri-view Tokens for Unsupervised Time Series Anomaly Detection

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses unsupervised time series anomaly detection (TSAD), where subtle anomalies are often indiscernible within any single view. The authors propose a multi-view fusion framework that models anomalies as cross-view inconsistencies by learning feature tokens from temporal, frequency, and multi-scale perspectives. Key contributions include learnable Nyquist-constrained spectral filters that generate multi-resolution signals, an adaptive token fusion mechanism across the three views, and a time-frequency cycle consistency constraint combined with a multi-scale reconstruction objective. Together these explicitly enforce analysis-synthesis consistency, which most prior cross-view methods do not. On real-world benchmarks, the method is reported to achieve the best detection accuracy among state-of-the-art baselines while reducing FLOPs by 5× and speeding up training by 8×.
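To make the idea of analysis-synthesis consistency concrete, here is a minimal NumPy sketch of a time -> frequency -> time round trip. It is an illustration under stated assumptions, not the authors' implementation: the function name `cycle_consistency_error`, the `freq_branch` argument, and the fixed low-pass stand-in for a learned frequency branch are all hypothetical.

```python
import numpy as np

def cycle_consistency_error(x, freq_branch=None):
    """Per-timestamp gap between x and its time -> frequency -> time round trip."""
    spectrum = np.fft.rfft(x)                    # analysis: time -> frequency
    if freq_branch is not None:
        spectrum = freq_branch(spectrum)         # hypothetical frequency-domain model
    x_cycle = np.fft.irfft(spectrum, n=len(x))   # synthesis: frequency -> time
    return np.abs(x - x_cycle)

def keep_low_bins(spectrum, max_bin=16):
    """Assumed stand-in for a learned branch: keep only the lowest frequency bins."""
    bins = np.arange(len(spectrum))
    return np.where(bins <= max_bin, spectrum, 0)

t = np.arange(256)
clean = np.sin(2 * np.pi * t / 32)               # periodic "normal" behaviour
spiky = clean.copy()
spiky[100] += 5.0                                # injected point anomaly

identity_err = cycle_consistency_error(clean)    # no branch: round trip is lossless
spike_err = cycle_consistency_error(spiky, keep_low_bins)
anomaly_index = int(np.argmax(spike_err))
```

With no frequency branch the round trip reproduces the input up to floating-point error. With the band-limited branch, the periodic component survives the cycle while the spike does not, so the disagreement concentrates at the anomalous timestamp.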

📝 Abstract

As a fundamental data mining task, unsupervised time series anomaly detection (TSAD) aims to build a model for identifying abnormal timestamps without assuming the availability of annotations. A key challenge in unsupervised TSAD is that many anomalies are too subtle to exhibit detectable deviation in any single view (e.g., time domain), and instead manifest as inconsistencies across multiple views like time, frequency, and a mixture of resolutions. However, most cross-view methods rely on feature or score fusion and do not enforce analysis-synthesis consistency, meaning the frequency branch is not required to reconstruct the time signal through an inverse transform, and vice versa. In this paper, we present Learnable Fusion of Tri-view Tokens (LEFT), a unified unsupervised TSAD framework that models anomalies as inconsistencies across complementary representations. LEFT learns feature tokens from three views of the same input time series: frequency-domain tokens that embed periodicity information, time-domain tokens that capture local dynamics, and multi-scale tokens that learn abnormal patterns at varying time series granularities. By learning a set of adaptive Nyquist-constrained spectral filters, the original time series is rescaled into multiple resolutions and then encoded, allowing these multi-scale tokens to complement the extracted frequency- and time-domain information. When generating the fused representation, we introduce a novel objective that reconstructs fine-grained targets from coarser multi-scale structure, and put forward an innovative time-frequency cycle consistency constraint to explicitly regularize cross-view agreement. Experiments on real-world benchmarks show that LEFT achieves the best detection accuracy among state-of-the-art (SOTA) baselines, with a 5x reduction in FLOPs and an 8x training speed-up.
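The rescaling step can be pictured with a small NumPy sketch. The abstract does not specify how the filters are parameterised, so everything here is an assumption made for illustration: the function name `nyquist_lowpass_downsample`, the per-bin `gain` vector standing in for learnable filter weights, and the hard cutoff at the Nyquist frequency of the coarser series.

```python
import numpy as np

def nyquist_lowpass_downsample(x, factor, gain=None):
    """Band-limit x below the coarser Nyquist frequency, then keep every factor-th sample."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n)           # cycles per sample, from 0 to 0.5
    cutoff = 0.5 / factor                # Nyquist frequency after downsampling
    mask = freqs < cutoff                # constraint: nothing at or above the cutoff passes
    if gain is None:
        gain = np.ones_like(freqs)       # assumed stand-in for learnable per-bin gains
    filtered = np.fft.irfft(spectrum * gain * mask, n=n)
    return filtered[::factor]

t = np.arange(256)
slow = np.sin(2 * np.pi * t / 64)        # 1/64 cycles per sample: below the cutoff
fast = 0.5 * np.cos(2 * np.pi * t / 4)   # 1/4 cycles per sample: above the cutoff
coarse = nyquist_lowpass_downsample(slow + fast, factor=4)
```

In this example the coarse series has 64 samples and retains only the slow component. Masking before subsampling is what keeps the fast component from aliasing into the coarser resolution; a trained model would presumably adjust the gains inside the permitted band rather than leave them at one.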

Problem

Research questions and friction points this paper is trying to address.

unsupervised time series anomaly detection

multi-view inconsistency

time-frequency representation

anomaly detection

cross-view agreement

Innovation

Methods, ideas, or system contributions that make the work stand out.

tri-view tokens

time-frequency cycle consistency

Nyquist-constrained spectral filters