AI Summary
This work addresses the practical limitations of existing brain-computer interface models, which are hindered by reliance on expensive hardware, large parameter counts, and poor adaptability to new users. To overcome these challenges, the authors propose a lightweight, multi-subject EEG-to-image decoding framework that fine-tunes effectively with only 15 minutes of data from a new user. The approach integrates a unified spatiotemporal backbone with multi-subject latent alignment layers and an MLP projector to map raw EEG signals into a visual latent space. Despite using less than 1% of the trainable parameters of prior methods, the model attains state-of-the-art performance on both the THINGS-EEG2 and AllJoined-1.6M benchmarks. Furthermore, this is the first EEG-to-Image study to evaluate reconstructions with extensive human behavioral ratings, and it delivers substantial gains in reconstruction quality and inference efficiency across both research-grade and consumer-grade EEG devices.
Abstract
To be practical for real-life applications, models for brain-computer interfaces must be easily and quickly deployable on new subjects, effective on affordable scanning hardware, and small enough to run locally on accessible computing resources. To directly address these current limitations, we introduce ENIGMA, a multi-subject electroencephalography (EEG)-to-Image decoding model that reconstructs seen images from EEG recordings and achieves state-of-the-art (SOTA) performance on the research-grade THINGS-EEG2 and consumer-grade AllJoined-1.6M benchmarks, while fine-tuning effectively on new subjects with as little as 15 minutes of data. ENIGMA boasts a simpler architecture and requires less than 1% of the trainable parameters necessary for previous approaches. Our approach integrates a subject-unified spatio-temporal backbone along with a set of multi-subject latent alignment layers and an MLP projector to map raw EEG signals to a rich visual latent space. We evaluate our approach using a broad suite of image reconstruction metrics that have been standardized in the adjacent field of fMRI-to-Image research, and we describe the first EEG-to-Image study to conduct extensive behavioral evaluations of our reconstructions using human raters. Our simple and robust architecture provides a significant performance boost across both research-grade and consumer-grade EEG hardware, and a substantial improvement in fine-tuning efficiency and inference cost. Finally, we provide extensive ablations to determine the architectural choices most responsible for our performance gains in both single and multi-subject cases across multiple benchmark datasets. Collectively, our work provides a substantial step towards the development of practical brain-computer interface applications.
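The decoding pipeline described above (per-subject alignment, a shared spatio-temporal backbone, and an MLP projector into a visual latent space) can be sketched in miniature. This is a hedged illustration only: the layer sizes, the choice of a 768-dimensional latent, the ReLU nonlinearities, and the use of plain linear maps are all assumptions for clarity, not ENIGMA's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 64 EEG channels,
# 250 time samples, and a 768-dim visual latent space (CLIP-sized).
N_CHANNELS, N_TIMES, LATENT_DIM = 64, 250, 768
N_SUBJECTS, HIDDEN = 4, 512

# Multi-subject latent alignment: one small linear map per subject
# that projects that subject's signals into a shared channel space.
# These are the only per-subject parameters, which is what makes
# fine-tuning on a new subject cheap in this sketch.
subject_align = [rng.standard_normal((N_CHANNELS, N_CHANNELS)) * 0.01
                 for _ in range(N_SUBJECTS)]

# Subject-unified spatio-temporal backbone, sketched here as a single
# linear layer over the flattened (channels x time) trial.
W_backbone = rng.standard_normal((N_CHANNELS * N_TIMES, HIDDEN)) * 0.01

# MLP projector into the visual latent space.
W1 = rng.standard_normal((HIDDEN, HIDDEN)) * 0.01
W2 = rng.standard_normal((HIDDEN, LATENT_DIM)) * 0.01

def decode(eeg, subject_id):
    """Map one raw EEG trial (channels x time) to a visual latent."""
    aligned = subject_align[subject_id] @ eeg            # subject alignment
    h = np.maximum(aligned.reshape(-1) @ W_backbone, 0)  # backbone + ReLU
    return np.maximum(h @ W1, 0) @ W2                    # MLP projector

latent = decode(rng.standard_normal((N_CHANNELS, N_TIMES)), subject_id=0)
print(latent.shape)  # (768,)
```

In the full system, a latent of this kind would condition a pretrained image generator to reconstruct the seen image; fine-tuning for a new subject would then amount to learning only that subject's alignment layer.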