Multimodal Crystal Flow: Any-to-Any Modality Generation for Unified Crystal Modeling

📅 2026-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing crystal generation methods are typically task-specific: they lack a unified framework that covers tasks such as crystal structure prediction and de novo generation. This work proposes Multimodal Crystal Flow (MCFlow), which unifies these crystal generation tasks within a single shared flow model. By assigning independent time variables to atom types and crystal structures, MCFlow realizes each task as a distinct inference trajectory, so generation can be conditioned on, or produce, either modality. The approach combines composition- and symmetry-aware atom ordering, hierarchical permutation augmentation, and a standard Transformer architecture to inject compositional and crystallographic priors without relying on explicit structural templates. On the MP-20 and MPTS-52 benchmarks, MCFlow performs competitively with task-specific baselines across multiple crystal generation tasks.

📝 Abstract
Crystal modeling spans a family of conditional and unconditional generation tasks across different modalities, including crystal structure prediction (CSP) and de novo generation (DNG). While recent deep generative models have shown promising performance, they remain largely task-specific, lacking a unified framework that shares crystal representations across different generation tasks. To address this limitation, we propose Multimodal Crystal Flow (MCFlow), a unified multimodal flow model that realizes multiple crystal generation tasks as distinct inference trajectories via independent time variables for atom types and crystal structures. To enable multimodal flow in a standard transformer model, we introduce a composition- and symmetry-aware atom ordering with hierarchical permutation augmentation, injecting strong compositional and crystallographic priors without explicit structural templates. Experiments on the MP-20 and MPTS-52 benchmarks show that MCFlow achieves competitive performance against task-specific baselines across multiple crystal generation tasks.
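
The idea of "distinct inference trajectories via independent time variables" can be illustrated with a small sketch. This is not the authors' implementation; it is a toy under stated assumptions: time runs from 0 (noise) to 1 (data), both modalities are stood in for by continuous vectors, and the names `time_schedule`, `make_toy_velocity`, and `euler_sample` are hypothetical. The point it shows is that holding one modality's time fixed at 1 keeps that modality unchanged (conditioning, as in CSP), while ramping both times generates both (as in DNG).

```python
import numpy as np


def time_schedule(task, num_steps):
    """Return (t_atom, t_struct), each of length num_steps + 1.

    Assumed convention: t = 0 is pure noise, t = 1 is clean data.
    """
    ramp = np.linspace(0.0, 1.0, num_steps + 1)
    clean = np.ones(num_steps + 1)
    if task == "csp":  # atom types given, structure generated
        return clean, ramp
    if task == "dng":  # atom types and structure generated jointly
        return ramp, ramp
    raise ValueError(f"unknown task: {task}")


def make_toy_velocity(target_atom, target_struct, eps=1e-6):
    """Toy stand-in for a learned model: straight-line velocity to a target."""
    def velocity(x_atom, x_struct, t_atom, t_struct):
        v_atom = (target_atom - x_atom) / max(1.0 - t_atom, eps)
        v_struct = (target_struct - x_struct) / max(1.0 - t_struct, eps)
        return v_atom, v_struct
    return velocity


def euler_sample(velocity_fn, x_atom, x_struct, t_atom, t_struct):
    """Euler integration with a separate step size per modality."""
    for k in range(len(t_atom) - 1):
        v_atom, v_struct = velocity_fn(x_atom, x_struct, t_atom[k], t_struct[k])
        # A modality whose time does not advance has step size 0 and stays fixed.
        x_atom = x_atom + (t_atom[k + 1] - t_atom[k]) * v_atom
        x_struct = x_struct + (t_struct[k + 1] - t_struct[k]) * v_struct
    return x_atom, x_struct


# CSP-style run: atom vector is held fixed, structure moves from noise to target.
target_atom = np.array([1.0, 0.0])
target_struct = np.array([0.25, 0.5, 0.75])
velocity = make_toy_velocity(target_atom, target_struct)
t_atom, t_struct = time_schedule("csp", num_steps=10)
given_atom = np.array([0.0, 1.0])
x_atom, x_struct = euler_sample(velocity, given_atom, np.zeros(3), t_atom, t_struct)
```

In this toy, switching the task string to `"dng"` reuses the same velocity function and sampler; only the pair of time schedules changes, which is the sense in which one model can serve several tasks.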
Problem

Research questions and friction points this paper is trying to address.

crystal modeling
multimodal generation
unified framework
crystal structure prediction
de novo generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal generative modeling
crystal structure prediction
normalizing flow
symmetry-aware representation
unified crystal generation
Kiyoung Seong
M.Sc. student, KAIST
AI for Science
Sungsoo Ahn
KAIST
Machine Learning
Sehui Han
Materials Intelligence Lab, LG AI Research, Seoul, South Korea
Changyoung Park
Materials Intelligence Lab, LG AI Research, Seoul, South Korea