AI Summary
This work addresses the need for non-invasive speech decoding in individuals with paralysis and dysarthria. Method: We introduce the first large-scale, standardized MEG-based speech neural decoding benchmark framework, built upon LibriBrain, the largest within-subject MEG speech dataset to date. It defines two core tasks, speech detection and phoneme classification, and establishes a standard track and an extended track to foster algorithmic innovation and compute-driven advancement, respectively. We release pnpl, the first end-to-end open-source toolkit for MEG speech decoding, integrating CNNs, Transformers, and other architectures, with unified support for data loading, training, and evaluation, alongside baseline models and a public leaderboard. Contribution/Results: The framework aims to create the conditions for an "ImageNet moment" in non-invasive speech BCIs by lowering entry barriers and accelerating the development of clinically deployable brain-computer interfaces.
Abstract
The advance of speech decoding from non-invasive brain data holds the potential for profound societal impact. Among its most promising applications is the restoration of communication to paralysed individuals affected by speech deficits such as dysarthria, without the need for high-risk surgical interventions. The ultimate aim of the 2025 PNPL competition is to produce the conditions for an "ImageNet moment" or breakthrough in non-invasive neural decoding, by harnessing the collective power of the machine learning community. To facilitate this vision, we present the largest within-subject MEG dataset recorded to date (LibriBrain) together with a user-friendly Python library (pnpl) for easy data access and integration with deep learning frameworks. For the competition, we define two foundational tasks (Speech Detection and Phoneme Classification from brain data), complete with standardised data splits and evaluation metrics, illustrative benchmark models, online tutorial code, a community discussion board, and a public leaderboard for submissions. To promote accessibility and participation, the competition features a Standard track that emphasises algorithmic innovation, as well as an Extended track that is expected to reward larger-scale computing, accelerating progress toward a non-invasive brain-computer interface for speech.