TokaMind: A Multi-Modal Transformer Foundation Model for Tokamak Plasma Dynamics

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes TokaMind, a multimodal Transformer-based pre-trained foundation model designed to address key challenges in tokamak plasma modeling—namely, the heterogeneity of multimodal diagnostic data, inconsistent sampling rates, and missing signals. TokaMind introduces a novel training-free DCT3D embedding method that enables unified representation of diverse data modalities, including time series, 2D profiles, and video streams. The architecture features plug-and-play embedding interfaces and component-wise selective loading, facilitating efficient fine-tuning and transfer learning. Integrated with VAE-based surrogate embeddings and robust mechanisms for handling missing signals, TokaMind significantly outperforms baseline models on the TokaMark benchmark using the MAST dataset. Notably, lightweight fine-tuning of the pre-trained model surpasses performance achieved by training from scratch, demonstrating the efficacy and generalization capability of multimodal pre-training in fusion plasma diagnostics.
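The component-wise selective loading described above can be illustrated with a minimal sketch: pretrained weights for chosen components are copied into the model and frozen, while the remaining components stay trainable for lightweight fine-tuning. The component names, checkpoint layout, and function signature below are illustrative assumptions, not TokaMind's actual interface.

```python
# Hypothetical sketch of component-wise selective loading and freezing.
# Component names and the checkpoint layout are illustrative assumptions,
# not TokaMind's actual interface.

COMPONENTS = ("embedding", "encoder", "decoder", "head")

def load_and_freeze(model_state, checkpoint, load, freeze):
    """Copy selected pretrained components into the model state and mark
    which components receive gradient updates: reuse the heavy pretrained
    parts, retrain only the task-specific ones."""
    for name in load:
        model_state[name] = dict(checkpoint[name])  # overwrite with pretrained weights
    trainable = {name: name not in freeze for name in model_state}
    return model_state, trainable

# Toy weights: each component is a dict of named parameters.
checkpoint = {c: {"w": 1.0} for c in COMPONENTS}  # pretrained values
model = {c: {"w": 0.0} for c in COMPONENTS}       # freshly initialised

model, trainable = load_and_freeze(
    model, checkpoint,
    load=("embedding", "encoder"),    # take these from the checkpoint
    freeze=("embedding", "encoder"),  # and keep them fixed during fine-tuning
)
```

This mimics the lightweight fine-tuning setup the summary mentions: the expensive pretrained components are reused unchanged, and only the remaining components are updated on the downstream task.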

📝 Abstract
We present TokaMind, an open-source foundation model framework for fusion plasma modeling, based on a Multi-Modal Transformer (MMT) and trained on heterogeneous tokamak diagnostics from the publicly available MAST dataset. TokaMind supports multiple data modalities (time series, 2D profiles, and videos) with different sampling rates, robust missing-signal handling, and efficient task adaptation via selectively loading and freezing four model components. To represent multi-modal signals, we use a training-free Discrete Cosine Transform embedding (DCT3D) and provide a clean interface for alternative embeddings (e.g., Variational Autoencoders, VAEs). We evaluate TokaMind on the recently introduced MAST benchmark TokaMark, comparing training and embedding strategies. Our results show that fine-tuned TokaMind outperforms the benchmark baseline on all but one task, and that, for several tasks, lightweight fine-tuning yields better performance than training the same architecture from scratch under a matched epoch budget. These findings highlight the benefits of multi-modal pretraining for tokamak plasma dynamics and provide a practical, extensible foundation for future fusion modeling tasks. Training code and model weights will be made publicly available.
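The core idea of a training-free DCT embedding can be sketched in a few lines: treat every modality as a 3D array, apply a separable 3D discrete cosine transform, and keep a fixed block of low-frequency coefficients so heterogeneous inputs map to tokens of equal length. The function names, token size, and truncation scheme below are illustrative assumptions, not the paper's exact DCT3D implementation.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix (equivalent to scipy.fft.dct(..., norm='ortho'))."""
    k = np.arange(n)[:, None]          # frequency index
    i = np.arange(n)[None, :]          # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] /= np.sqrt(2.0)               # DC row normalisation
    return m

def dct3d_embed(signal: np.ndarray, token_shape=(8, 8, 8)) -> np.ndarray:
    """Training-free embedding: separable 3D DCT, then keep a fixed block
    of low-frequency coefficients so every input maps to the same length."""
    coeffs = signal.astype(float)
    for axis in range(3):              # apply the 1D DCT along each axis in turn
        m = dct_matrix(signal.shape[axis])
        coeffs = np.moveaxis(np.tensordot(m, coeffs, axes=(1, axis)), 0, axis)
    block = np.zeros(token_shape)
    t, h, w = (min(a, b) for a, b in zip(token_shape, signal.shape))
    block[:t, :h, :w] = coeffs[:t, :h, :w]   # truncate to low frequencies
    return block.ravel()               # fixed-length token for any input size

# Heterogeneous modalities all map to the same 512-dim token:
video   = np.random.rand(16, 64, 64)   # video clip (T x H x W)
profile = np.random.rand(1, 32, 20)    # 2D profile (singleton time axis)
series  = np.random.rand(100, 1, 1)    # time series reshaped to 3D

tokens = [dct3d_embed(x) for x in (video, profile, series)]
```

Because smooth physical signals concentrate most of their energy in low frequencies, truncating to a small coefficient block preserves the bulk of the information while requiring no learned parameters, which is what makes the embedding "training-free".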
Problem

Research questions and friction points this paper is trying to address.

tokamak
plasma dynamics
multi-modal data
fusion modeling
heterogeneous diagnostics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Modal Transformer
DCT3D embedding
tokamak plasma dynamics
foundation model
heterogeneous diagnostics
Tobia Boschi
IBM Research Europe
Statistics

Andrea Loreti
UK Atomic Energy Authority

Nicola C. Amorisco
UK Atomic Energy Authority

Rodrigo H. Ordonez-Hurtado
IBM Research Europe

Cécile Rousseau
IBM Research Europe

George K. Holt
STFC Hartree Centre

Eszter Székely
UK Atomic Energy Authority

Alexander Whittle
UK Atomic Energy Authority

Samuel Jackson
UK Atomic Energy Authority

Adriano Agnello
STFC Hartree Centre

Stanislas Pamela
CCFE - UKAEA - UK
Plasma Physics, Fusion

Alessandra Pascale
IBM Research Europe

Robert Akers
Director of Computing Programmes & Senior Fellow, UKAEA
Plasma Physics, High Energy Physics, HPC, AI, Engineering Simulation

Juan Bernabe Moreno
IBM Research Europe

Vassil Alexandrov
Hartree Centre - STFC, UK
Computational Science, Parallel Algorithms, Monte Carlo Methods, Scalable Algorithms, Air Pollution Modeling

Mykhaylo Zayats
IBM Research
Applied Mathematics, Scientific Computing, AI for Science