UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion

📅 2025-03-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Joint generation of sequences and 3D structures for scientific data (e.g., materials, molecules, proteins) remains challenging: autoregressive models suffer from limited accuracy, while diffusion models struggle with discrete sequence modeling. Method: We propose the first unified autoregressive–conditional diffusion framework for co-generating sequences and structures. An autoregressive module guides diffusion training, while the diffusion module reciprocally refines sequence prediction; combined with VQ-VAE-based discrete tokenization and multimodal alignment, the framework ensures stable long-range structural generation. Contribution/Results: Our method achieves significant improvements over state-of-the-art (SOTA) in crystal structure prediction. It also establishes new SOTA performance in de novo small-molecule design, conditional generation, and long-sequence structural modeling—demonstrating substantial gains in both accuracy and generation stability.

Technology Category

Application Category

📝 Abstract
Unified generation of sequence and structure for scientific data (e.g., materials, molecules, proteins) is a critical task. Existing approaches primarily rely on either autoregressive sequence models or diffusion models, each offering distinct advantages and facing notable limitations. Autoregressive models, such as GPT, Llama, and Phi-4, have demonstrated remarkable success in natural language generation and have been extended to multimodal tasks (e.g., image, video, and audio) using advanced encoders like VQ-VAE to represent complex modalities as discrete sequences. However, their direct application to scientific domains is challenging due to the high precision requirements and the diverse nature of scientific data. On the other hand, diffusion models excel at generating high-dimensional scientific data, such as protein, molecule, and material structures, with remarkable accuracy. Yet, their inability to effectively model sequences limits their potential as general-purpose multimodal foundation models. To address these challenges, we propose UniGenX, a unified framework that combines autoregressive next-token prediction with conditional diffusion models. This integration leverages the strengths of autoregressive models to ease the training of conditional diffusion models, while diffusion-based generative heads enhance the precision of autoregressive predictions. We validate the effectiveness of UniGenX on material and small molecule generation tasks, achieving a significant leap in state-of-the-art performance for material crystal structure prediction and establishing new state-of-the-art results for small molecule structure prediction, de novo design, and conditional generation. Notably, UniGenX demonstrates significant improvements, especially in handling long sequences for complex structures, showcasing its efficacy as a versatile tool for scientific data generation.
Problem

Research questions and friction points this paper is trying to address.

Unified generation of sequence and structure for scientific data.
Integration of autoregressive and diffusion models for enhanced precision.
Improvement in material and molecule generation tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines autoregressive models with diffusion models
Enhances precision in scientific data generation
Achieves state-of-the-art in material and molecule prediction
🔎 Similar Papers
No similar papers found.
Gongbo Zhang
Gongbo Zhang
School of Electronic and Computer Engineering, Peking University
AI for ScienceMachine LearningGenerative Model
Y
Yanting Li
DSA, The Hong Kong University of Science and Technology (Guangzhou)
Renqian Luo
Renqian Luo
Senior Researcher, Microsoft Research
Artificial IntelligenceMachine LearningDeep Learning
Pipi Hu
Pipi Hu
Senior researcher, Microsoft Research AI4Science
Differential equation related neural networks
Z
Zeru Zhao
School of Artificial Intelligence and Automation, Huazhong University of Science and Technology
L
Lingbo Li
Department of Automation, Tsinghua University
Guoqing Liu
Guoqing Liu
Microsoft Research AI for Science
Artificial IntelligenceReinforcement LearningLarge Language ModelsAI for Science
Z
Zun Wang
Microsoft Research AI for Science
R
Ran Bi
Microsoft Research AI for Science
Kaiyuan Gao
Kaiyuan Gao
Huazhong University of Science and Technology
Visual GenerationAI4Science
Liya Guo
Liya Guo
Tsinghua University
Y
Yu Xie
Microsoft Research AI for Science
C
Chang Liu
Microsoft Research AI for Science
J
Jia Zhang
Microsoft Research AI for Science
T
Tian Xie
Microsoft Research AI for Science
Robert Pinsler
Robert Pinsler
Microsoft Research
AI for ScienceMachine Learning
Claudio Zeni
Claudio Zeni
Senior Researcher @ Microsoft
machine learningmolecular dynamicsnanoparticlesdiffusion modelscrystal generation
Z
Ziheng Lu
Microsoft Research AI for Science
Yingce Xia
Yingce Xia
Unknown affiliation
Large Language ModelMachine LearningDrug Discovery
Marwin Segler
Marwin Segler
Microsoft Research AI for Science
Machine LearningMedicinal ChemistryOrganic ChemistryReinforcement LearningChemoinformatics
M
Maik Riechert
Microsoft Research AI for Science
Li Yuan
Li Yuan
Research Associate, University of Science & Technology of China (USTC)
Antibiotic resistanceWastewater treatmentEnvironmental bioremediationAnaerobic digestionFate of organic pollutants
L
Lei Chen
DSA, The Hong Kong University of Science and Technology (Guangzhou)
Haiguang Liu
Haiguang Liu
Zhongguancun Academy
ai4sciencebiophysicsstructure biologyx-ray laserserial crystallography
Tao Qin
Tao Qin
Vice President, Zhongguancun Academy
Deep LearningAI4ScienceSpeech SynthesisNeural Machine TranslationInformation Retrieval