From Clever Hans to Scientific Discovery: Interpreting EEG Foundational Transformers with LRP

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the limited trustworthiness of EEG foundation models (EEG-FMs) in clinical diagnostics and brain-computer interfaces, which stems from their opaque decision-making. The authors apply attention-aware Layer-wise Relevance Propagation (LRP) as a post-hoc attribution method for Transformer-based EEG-FMs, extending LRP's earlier use on CNN-based EEG models. In motor imagery, the method exposes "Clever Hans" behavior, in which models rely on task-correlated ocular signals rather than the intended motor correlates. In a naturalistic affect-prediction paradigm, it reveals a recurring reliance on a central electrode cluster, suggesting a candidate sensorimotor signature of arousal. Heatmap interpretation remains ambiguous, but the results position LRP as a tool for both verifying EEG-FM decisions and generating neuroscience hypotheses.

📝 Abstract

Emerging foundation models (FMs) in electroencephalography (EEG) promise a path to scale deep learning in diagnostics and brain-computer interfaces despite data scarcity, yet their opaque nature remains a barrier to wider adoption. We investigate attention-aware layer-wise relevance propagation (LRP) as a post-hoc attribution method for EEG-FMs, extending LRP's use on convolutional neural network (CNN)-based EEG models to the Transformer architectures that current FMs are based on. We find that LRP can both verify EEG-FM decisions and surface novel, biologically plausible hypotheses from them. In motor imagery, it unmasks 'Clever Hans' behavior where models prioritize task-correlated ocular signals over the intended motor correlates. In a naturalistic paradigm for affect prediction, it reveals a recurring reliance on a central electrode cluster, suggesting a candidate sensorimotor signature of arousal. Though heatmap interpretation remains ambiguous in this complex domain, the results position LRP as a tool for both verification and exploration of EEG-FMs, a role that will grow in both importance and discovery potential as the underlying models mature.
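
For readers unfamiliar with LRP, the following is a minimal sketch of the generic LRP-epsilon rule on a toy two-layer ReLU network, written in NumPy. It is not the attention-aware variant the paper investigates, which needs additional rules for Transformer components such as attention and normalization layers. The network, its random weights, the 8 "channels", and the helper name `lrp_epsilon_dense` are all illustrative assumptions and do not come from the paper.

```python
import numpy as np

def lrp_epsilon_dense(a, W, R_out, eps=1e-6):
    """LRP-epsilon rule for one bias-free dense layer (hypothetical helper).

    a:     input activations of the layer, shape (n_in,)
    W:     weight matrix, shape (n_in, n_out)
    R_out: relevance assigned to the layer outputs, shape (n_out,)
    Returns the relevance redistributed to the inputs, shape (n_in,).
    """
    z = a @ W                                        # pre-activations
    z_stab = z + eps * np.where(z >= 0, 1.0, -1.0)   # avoid division by zero
    s = R_out / z_stab                               # relevance per unit of z
    return a * (W @ s)                               # share by contribution a_i * w_ij

# Toy stand-in for a model: random weights, assumed sizes.
rng = np.random.default_rng(0)
n_channels, n_hidden, n_classes = 8, 16, 2
W1 = rng.normal(size=(n_channels, n_hidden))
W2 = rng.normal(size=(n_hidden, n_classes))

# Forward pass on one toy input (one feature per "channel").
x = rng.normal(size=n_channels)
h = np.maximum(0.0, x @ W1)          # ReLU hidden layer
logits = h @ W2

# Start from the predicted class logit and propagate relevance backwards.
target = int(np.argmax(logits))
R_logits = np.zeros(n_classes)
R_logits[target] = logits[target]

R_hidden = lrp_epsilon_dense(h, W2, R_logits)
R_input = lrp_epsilon_dense(x, W1, R_hidden)   # one relevance score per channel

print("per-channel relevance:", np.round(R_input, 3))
```

Because the toy layers have no bias terms, the per-channel relevances sum (up to the epsilon stabilizer) to the explained logit. In an EEG setting, scores of this kind are what get rendered as channel or time-point heatmaps, which is where the interpretation questions noted in the abstract arise.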

Problem

Research questions and friction points this paper is trying to address.

EEG foundation models

interpretability

Transformer

attribution methods

Clever Hans

Innovation

Methods, ideas, or system contributions that make the work stand out.

Layer-wise Relevance Propagation

EEG foundation models

Transformer interpretability