Fast Spectrogram Event Extraction via Offline Self-Supervised Learning: From Fusion Diagnostics to Bioacoustics

📅 2026-02-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of efficiently extracting coherent, quasi-coherent, and transient modes from the petabytes of high-noise time-frequency signals that next-generation fusion devices such as ITER are expected to generate daily, a volume that manual analysis cannot keep up with. The authors propose a “signals-first” self-supervised framework that combines offline self-supervised learning with nonlinear optimal multichannel signal processing and a fast neural network surrogate model. The approach automatically identifies plasma fluctuation events in multi-diagnostic data (fast magnetic probes, electron cyclotron emission, CO₂ interferometry, and beam emission spectroscopy) with a reported inference latency of 0.5 seconds. The method is tested on DIII-D, TJ-II, and non-fusion spectrogram datasets, suggesting cross-domain generalizability, and targets real-time mode identification and large-scale automated database generation for advanced plasma control.

📝 Abstract

Next-generation fusion facilities like ITER face a "data deluge," generating petabytes of multi-diagnostic signals daily that challenge manual analysis. We present a "signals-first" self-supervised framework for the automated extraction of coherent and transient modes from high-noise time-frequency data. We also develop a general-purpose method and tool for extracting coherent, quasi-coherent, and transient modes from fluctuation measurements in tokamaks by employing non-linear optimal techniques in multichannel signal processing with a fast neural network surrogate on fast magnetics, electron cyclotron emission, CO₂ interferometer, and beam emission spectroscopy measurements from DIII-D. Results are tested on data from DIII-D, TJ-II, and non-fusion spectrograms. With an inference latency of 0.5 seconds, this framework enables real-time mode identification and large-scale automated database generation for advanced plasma control. The repository is available at https://github.com/PlasmaControl/TokEye.
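To make the task concrete, the following is a minimal sketch of what "extracting a coherent mode from a noisy spectrogram" can mean in the simplest case. It is not the authors' method and does not use the TokEye code: the learned surrogate is replaced here by a plain median-based threshold, the signal is synthetic, and every name (`make_signal`, `stft_power`, `coherent_mask`, `dominant_bin`) and parameter value is a hypothetical chosen for illustration. A naive DFT keeps the example dependency-free; a real pipeline would more likely use an FFT routine.

```python
import cmath
import math
import random
from statistics import median


def make_signal(n_samples=1024, tone_bin=8, frame_len=64, noise_std=0.5, seed=0):
    """Synthetic stand-in for a coherent mode: a unit tone buried in Gaussian noise."""
    rng = random.Random(seed)
    freq = tone_bin / frame_len  # cycles per sample, centred on one STFT bin
    return [math.sin(2 * math.pi * freq * n) + rng.gauss(0.0, noise_std)
            for n in range(n_samples)]


def stft_power(signal, frame_len=64, hop=32):
    """Power spectrogram as a list of frames, each holding bins 0..frame_len//2."""
    # Hann window limits leakage of the tone into distant frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT, O(frame_len^2) per frame; acceptable for a toy example.
        frames.append([
            abs(sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(chunk))) ** 2
            for k in range(frame_len // 2 + 1)
        ])
    return frames


def coherent_mask(power, factor=8.0):
    """Flag cells whose power exceeds `factor` times the global median (noise floor)."""
    floor = median(p for frame in power for p in frame)
    return [[p > factor * floor for p in frame] for frame in power]


def dominant_bin(power):
    """Frequency bin carrying the most total power across all frames."""
    n_bins = len(power[0])
    totals = [sum(frame[k] for frame in power) for k in range(n_bins)]
    return max(range(n_bins), key=totals.__getitem__)


if __name__ == "__main__":
    power = stft_power(make_signal())
    mask = coherent_mask(power)
    print("frames:", len(power), "dominant bin:", dominant_bin(power))
    print("flagged cells:", sum(map(sum, mask)))
```

The tone is placed at bin 8 by construction, so the mask should light up a persistent horizontal ridge around that bin plus a few isolated noise cells. A fixed threshold like this does not separate such stray cells from genuine transient events, which is the kind of limitation a learned extractor is meant to address.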

Problem

Research questions and friction points this paper is trying to address.

data deluge

coherent modes

transient modes

time-frequency data

automated extraction

Innovation

Methods, ideas, or system contributions that make the work stand out.

self-supervised learning

time-frequency analysis

multichannel signal processing