Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series

📅 2026-02-09

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of conventional Transformers on multichannel medical time series: their decentralized attention struggles to capture global synchronization and consistent waveform patterns across channels. The authors propose CoTAR (Core Token Aggregation-Redistribution), an MLP-based module that replaces self-attention with a centralized interaction mechanism in which a single global core token aggregates information from all tokens and redistributes it back to them. This design matches the inherently centralized nature of physiological signals and reduces computational complexity from quadratic to linear. Experiments on five benchmark datasets show an improvement of up to 11.6% on APAVA while using only 33% of the memory and 20% of the inference time of the previous state of the art.

📝 Abstract

Accurate analysis of medical time series (MedTS) data, such as electroencephalography (EEG) and electrocardiography (ECG), plays a pivotal role in healthcare applications, including the diagnosis of brain and heart diseases. MedTS data typically exhibit two critical patterns: temporal dependencies within individual channels and channel dependencies across multiple channels. While recent advances in deep learning have leveraged Transformer-based models to effectively capture temporal dependencies, they often struggle with modeling channel dependencies. This limitation stems from a structural mismatch: MedTS signals are inherently centralized, whereas the Transformer's attention mechanism is decentralized, making it less effective at capturing global synchronization and unified waveform patterns. To address this mismatch, we propose CoTAR (Core Token Aggregation-Redistribution), a centralized MLP-based module designed to replace decentralized attention. Instead of allowing all tokens to interact directly, as in standard attention, CoTAR introduces a global core token that serves as a proxy to facilitate inter-token interactions, thereby enforcing a centralized aggregation and redistribution strategy. This design not only better aligns with the centralized nature of MedTS signals but also reduces computational complexity from quadratic to linear. Experiments on five benchmarks validate the superiority of our method in both effectiveness and efficiency, achieving up to an 11.6% improvement on the APAVA dataset, while using only 33% of the memory and 20% of the inference time compared to the previous state of the art. Code and all training scripts are available at https://github.com/Levi-Ackman/TeCh.
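
The aggregation and redistribution idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal, standard-library-only illustration under stated assumptions: tokens are plain lists of floats, the core token is formed by mean pooling over per-token projections, and redistribution concatenates the core to each token before a linear fusion with a residual connection. The names `init_linear`, `linear`, and `core_token_mix`, and the pooling and fusion choices, are hypothetical.

```python
import math
import random


def init_linear(rng, out_dim, in_dim):
    """Create a (weight, bias) pair for a hypothetical linear layer."""
    scale = 1.0 / math.sqrt(in_dim)
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def linear(x, weight, bias):
    """Apply y = W x + b to a single vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def core_token_mix(tokens, agg_params, fuse_params):
    """Mix tokens through one shared core token instead of pairwise attention.

    Assumed steps (illustrative only):
      1. project each token and apply ReLU,
      2. aggregate the projections into one core token by mean pooling,
      3. redistribute by concatenating the core to every token, fusing
         with a linear layer, and adding a residual connection.
    """
    # Step 1: per-token projection (one pass over the tokens).
    projected = [[max(0.0, v) for v in linear(t, *agg_params)] for t in tokens]

    # Step 2: a single global core token summarizing all tokens.
    n = len(projected)
    core = [sum(column) / n for column in zip(*projected)]

    # Step 3: every token interacts only with the core (second pass).
    mixed = []
    for t in tokens:
        fused = linear(t + core, *fuse_params)  # list concatenation
        mixed.append([a + b for a, b in zip(t, fused)])
    return mixed


rng = random.Random(0)
num_tokens, dim, core_dim = 16, 8, 4  # arbitrary sizes for the sketch
tokens = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_tokens)]
agg_params = init_linear(rng, core_dim, dim)
fuse_params = init_linear(rng, dim, dim + core_dim)
mixed = core_token_mix(tokens, agg_params, fuse_params)
```

In this sketch each token is visited a fixed number of times and only ever interacts with the core, so the work grows linearly with the number of tokens. Standard self-attention, by contrast, scores every token against every other token, which grows quadratically.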

Problem

Research questions and friction points this paper is trying to address.

medical time series

channel dependencies

centralized signals

Transformer attention

temporal dependencies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Centralized Attention

Medical Time Series

Transformer Architecture