Multimodal Crystal Flow: Any-to-Any Modality Generation for Unified Crystal Modeling

📅 2026-02-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing crystal generation methods are typically task-specific and lack a unified framework that covers tasks such as crystal structure prediction and de novo generation. This work proposes Multimodal Crystal Flow (MCFlow), a flow model that unifies these tasks by assigning independent time variables to atom types and crystal structures, so that each task corresponds to a distinct inference trajectory. To run this multimodal flow in a standard Transformer, the approach uses a composition- and symmetry-aware atom ordering with hierarchical permutation augmentation, which injects compositional and crystallographic priors without explicit structural templates. On the MP-20 and MPTS-52 benchmarks, MCFlow performs competitively with task-specific baselines across multiple crystal generation tasks.

📝 Abstract

Crystal modeling spans a family of conditional and unconditional generation tasks across different modalities, including crystal structure prediction (CSP) and de novo generation (DNG). While recent deep generative models have shown promising performance, they remain largely task-specific, lacking a unified framework that shares crystal representations across different generation tasks. To address this limitation, we propose Multimodal Crystal Flow (MCFlow), a unified multimodal flow model that realizes multiple crystal generation tasks as distinct inference trajectories via independent time variables for atom types and crystal structures. To enable multimodal flow in a standard transformer model, we introduce a composition- and symmetry-aware atom ordering with hierarchical permutation augmentation, injecting strong compositional and crystallographic priors without explicit structural templates. Experiments on the MP-20 and MPTS-52 benchmarks show that MCFlow achieves competitive performance against task-specific baselines across multiple crystal generation tasks.
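
The idea of realizing tasks as distinct inference trajectories through two independent time variables can be illustrated with a small sketch. It is not the authors' implementation: the function name `time_schedule`, the task labels, the uniform step grid, and the convention that t = 0 is pure noise and t = 1 is clean data are all assumptions made for illustration. Under those assumptions, CSP holds the atom-type time at 1 (the composition is given) while the structure time runs from 0 to 1, and DNG advances both together.

```python
from typing import List, Tuple


def time_schedule(task: str, num_steps: int) -> List[Tuple[float, float]]:
    """Return (t_atom, t_struct) pairs on a uniform grid of num_steps steps.

    Assumed convention (not from the paper): t = 0 is noise, t = 1 is data.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    grid = [i / num_steps for i in range(num_steps + 1)]

    if task == "dng":
        # De novo generation: atom types and structure are denoised jointly.
        return [(t, t) for t in grid]
    if task == "csp":
        # Structure prediction: atom types are known, so their time stays at 1.
        return [(1.0, t) for t in grid]
    if task == "structure_to_composition":
        # Hypothetical reverse direction: structure known, atom types generated.
        return [(t, 1.0) for t in grid]
    raise ValueError(f"unknown task: {task}")


if __name__ == "__main__":
    for name in ("dng", "csp"):
        print(name, time_schedule(name, 4))
```

In a sampler built on this sketch, each pair would be passed to the model at the corresponding integration step, so the same network serves every task and only the schedule changes.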

Problem

Research questions and friction points this paper is trying to address.

crystal modeling

multimodal generation

unified framework

crystal structure prediction

de novo generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal generative modeling

crystal structure prediction

normalizing flow