Bypassing Direct Reconstruction: Speech Detection from MEG via Large-Scale Audio Retrieval

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Accurately decoding speech from non-invasive magnetoencephalography (MEG) signals, and in particular distinguishing speech from silence intervals, remains a significant challenge. This work proposes a two-stage framework: a contrastive learning model first retrieves the audio segment that matches a given MEG input from a large-scale speech corpus (LibriVox), and a speech detection model then generates a binary speech/silence sequence from the retrieved segment. By bypassing direct speech reconstruction and bringing external large-scale audio retrieval into neural decoding, the method took first place in the extended track of the LibriBrain 2025 Speech Detection Challenge with an F1-score of 0.962.
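To make the contrastive step concrete, here is a minimal NumPy sketch of a symmetric InfoNCE objective that pulls paired MEG and audio embeddings together. The page does not state which loss the authors used, so the CLIP-style formulation, the function name `info_nce_loss`, and the `temperature` value are all illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(z, axis):
    # Numerically stable log-softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def info_nce_loss(meg_emb, audio_emb, temperature=0.1):
    """Symmetric InfoNCE loss (assumed, not confirmed by the source).

    meg_emb, audio_emb: arrays of shape (batch, dim), where row i of each
    array comes from the same time window (a positive pair).
    """
    meg = l2_normalize(meg_emb)
    audio = l2_normalize(audio_emb)
    logits = meg @ audio.T / temperature      # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])          # positives sit on the diagonal
    loss_meg_to_audio = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_audio_to_meg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_meg_to_audio + loss_audio_to_meg)
```

With such an objective, correctly paired batches yield a lower loss than mismatched ones, which is what lets a nearest-neighbour search over audio embeddings act as retrieval at test time.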

📝 Abstract

Decoding speech from non-invasive brain signals is challenging. For the LibriBrain 2025 Speech Detection task, we propose a novel two-step framework that bypasses direct reconstruction. First, a contrastive learning model retrieves the matching speech segment for the given test MEG from a large-scale audio library (LibriVox). Second, a speech detection model generates the binary silence/speech sequence directly from this retrieved audio. With this approach, our team Sherlock Holmes achieved first place in the extended track (F1-score: 0.962), demonstrating that leveraging external audio databases is a highly effective strategy.
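The two steps described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the team's implementation: embeddings are taken as given, retrieval is a plain cosine-similarity nearest-neighbour search, and a simple frame-energy threshold stands in for the speech detection model. The names `retrieve`, `speech_mask`, and `f1_score`, and the parameters `frame_len` and `threshold`, are hypothetical.

```python
import numpy as np

def retrieve(meg_emb, library_emb):
    """Step 1 (assumed form): index of the library segment whose embedding
    has the highest cosine similarity with the MEG embedding."""
    query = meg_emb / np.linalg.norm(meg_emb)
    library = library_emb / np.linalg.norm(library_emb, axis=1, keepdims=True)
    return int(np.argmax(library @ query))

def speech_mask(audio, frame_len=160, threshold=0.01):
    """Step 2 stand-in: label each frame 1 (speech) or 0 (silence) by RMS
    energy. A trained detector would replace this threshold rule."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return (rms > threshold).astype(int)

def f1_score(y_true, y_pred):
    """Binary F1 for the speech class: 2*TP / (2*TP + FP + FN)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

The point of this design is that the labels are derived from clean retrieved audio rather than from the noisy MEG signal itself, so label quality depends mainly on whether retrieval finds the right segment.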

Problem

Research questions and friction points this paper is trying to address.

speech detection

MEG

brain decoding

non-invasive brain signals

audio retrieval

Innovation

Methods, ideas, or system contributions that make the work stand out.

contrastive learning

audio retrieval

MEG decoding

speech detection

two-step framework
