Intern-S1: A Scientific Multimodal Foundation Model

📅 2025-08-21
📈 Citations: 0 (influential: 0)
🤖 AI Summary
To address the performance gap of open-source foundation models in high-value scientific domains, a gap that holds back paradigm shifts in scientific research, this work introduces a science-oriented multimodal Mixture-of-Experts (MoE) foundation model. The model has 28 billion activated parameters and 241 billion total parameters, and is continually pretrained on 5 trillion tokens, of which over 2.5 trillion come from scientific text, structured data, and images. The authors propose a novel Mixture-of-Rewards (MoR) mechanism that enables parallel reinforcement learning across more than 1,000 scientific tasks. Through synergistic offline and online RL optimization, the model achieves state-of-the-art performance among open-source models, and surpasses proprietary SOTA baselines, on key tasks including molecular synthesis planning, reaction condition prediction, and crystal thermodynamic stability prediction. Its integrated scientific reasoning capability represents the new open-source frontier in foundational AI for science.
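The parameter split quoted above (28B activated out of 241B total) is characteristic of sparse MoE layers, where a router sends each token through only a few experts. The toy PyTorch layer below is a minimal sketch of that idea; all sizes and the top-k routing scheme are illustrative assumptions, not Intern-S1's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Toy sparse MoE feed-forward layer with top-k routing.

    Only k of n_experts expert MLPs run per token, so the parameters
    touched per token ("activated") are a small fraction of the total,
    which is how a model can have 241B total but far fewer activated
    parameters. All sizes here are illustrative assumptions.
    """

    def __init__(self, d_model: int = 512, d_ff: int = 2048,
                 n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model)
        logits = self.gate(x)                        # (n_tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)   # pick k experts per token
        weights = F.softmax(weights, dim=-1)         # normalize over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == e             # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = TopKMoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```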

📝 Abstract
In recent years, a plethora of open-source foundation models have emerged, achieving remarkable progress in widely followed fields, with performance close to that of closed-source models. However, in high-value but more challenging scientific professional fields, either the fields still rely on expert models, or the progress of general foundation models lags significantly behind that in popular areas, far from sufficient for transforming scientific research, leaving a substantial gap between open-source and closed-source models in these scientific domains. To mitigate this gap and explore a step further toward Artificial General Intelligence (AGI), we introduce Intern-S1, a specialized generalist equipped with general understanding and reasoning capabilities and the expertise to analyze multiple scientific modalities of data. Intern-S1 is a multimodal Mixture-of-Experts (MoE) model with 28 billion activated parameters and 241 billion total parameters, continually pre-trained on 5T tokens, including over 2.5T tokens from scientific domains. In the post-training stage, Intern-S1 undergoes offline and then online reinforcement learning (RL) in InternBootCamp, where we propose Mixture-of-Rewards (MoR) to synergize RL training on more than 1,000 tasks simultaneously. Through integrated innovations in algorithms, data, and training systems, Intern-S1 achieved top-tier performance in online RL training. On comprehensive evaluation benchmarks, Intern-S1 demonstrates competitive performance on general reasoning tasks among open-source models and significantly outperforms open-source models in scientific domains, surpassing closed-source state-of-the-art models in professional tasks such as molecular synthesis planning, reaction condition prediction, and predicting thermodynamic stability for crystals. Our models are available at https://huggingface.co/internlm/Intern-S1.
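Since the abstract points to a released checkpoint, the snippet below sketches one plausible way to load it with Hugging Face transformers. The class choices, flags, and generation settings are assumptions on my part; consult the model card at https://huggingface.co/internlm/Intern-S1 for the supported usage.

```python
# Minimal text-only loading sketch; assumes the `transformers` and
# `accelerate` packages are installed. Class choices and flags are
# assumptions, not confirmed by the paper; check the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "internlm/Intern-S1"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # likely needed for a custom multimodal architecture
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Propose a retrosynthesis route for ibuprofen."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```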
Problem

Research questions and friction points this paper is trying to address.

Addressing the performance gap between open-source and closed-source scientific foundation models
Developing multimodal AI with expert capabilities for scientific data analysis
Advancing AGI through specialized scientific reasoning and understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Mixture-of-Experts architecture with 241B total parameters (28B activated)
Continual pretraining on 5T tokens, including over 2.5T tokens from scientific domains
Offline followed by online reinforcement learning with Mixture-of-Rewards across 1,000+ tasks (see the sketch after this list)
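To make the last bullet concrete, here is a minimal sketch of what a Mixture-of-Rewards dispatcher could look like: each rollout is scored by its own task's verifier, and rewards are then normalized within each task so that 1,000+ heterogeneous tasks can be optimized in a single RL run. The reward functions and the per-task mean-centering are illustrative assumptions; the paper's actual MoR mechanism may weight, schedule, or combine rewards differently.

```python
from typing import Callable, Dict, List

# (response, reference) -> scalar reward; each task ships its own verifier.
RewardFn = Callable[[str, str], float]

def mixture_of_rewards(batch: List[dict],
                       reward_fns: Dict[str, RewardFn]) -> List[float]:
    """Score each rollout with its task-specific verifier, then
    mean-center rewards within each task so that no single verifier's
    scale dominates a joint RL update over many heterogeneous tasks.
    A conceptual sketch only, not the paper's exact MoR algorithm.
    """
    raw = [reward_fns[item["task"]](item["response"], item["reference"])
           for item in batch]
    by_task: Dict[str, List[int]] = {}
    for i, item in enumerate(batch):
        by_task.setdefault(item["task"], []).append(i)
    centered = list(raw)
    for indices in by_task.values():
        mean = sum(raw[i] for i in indices) / len(indices)
        for i in indices:
            centered[i] = raw[i] - mean
    return centered

# Toy verifiers (hypothetical): exact-match math answers, substring chemistry checks.
fns = {"math": lambda r, ref: float(r.strip() == ref),
       "chem": lambda r, ref: float(ref in r)}
batch = [{"task": "math", "response": "42", "reference": "42"},
         {"task": "math", "response": "41", "reference": "42"},
         {"task": "chem", "response": "reflux in THF", "reference": "THF"},
         {"task": "chem", "response": "use water", "reference": "THF"}]
print(mixture_of_rewards(batch, fns))  # [0.5, -0.5, 0.5, -0.5]
```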
👥 Authors
Lei Bai (Shanghai AI Laboratory): Foundation Model · Science Intelligence · Multi-Agent System · Autonomous Discovery
Zhongrui Cai (Intern-S1 Team, Shanghai AI Laboratory)
Maosong Cao (Shanghai AI Lab): CV · NLP
Weihan Cao (Intern-S1 Team, Shanghai AI Laboratory)
Chiyu Chen (Intern-S1 Team, Shanghai AI Laboratory)
Haojiong Chen (Intern-S1 Team, Shanghai AI Laboratory)
Kai Chen (Intern-S1 Team, Shanghai AI Laboratory)
Pengcheng Chen (Intern-S1 Team, Shanghai AI Laboratory)
Ying Chen (Intern-S1 Team, Shanghai AI Laboratory)
Yongkang Chen (Intern-S1 Team, Shanghai AI Laboratory)
Yu Cheng (Intern-S1 Team, Shanghai AI Laboratory)
Pei Chu (Intern-S1 Team, Shanghai AI Laboratory)
Tao Chu (SCUT)
Erfei Cui (Shanghai AI Laboratory; Shanghai Jiao Tong University): Computer Vision
Ganqu Cui (Shanghai AI Lab): LLM Alignment · Reinforcement Learning
Long Cui (Intern-S1 Team, Shanghai AI Laboratory)
Ziyun Cui (Tsinghua University)
Nianchen Deng (Shanghai AI Laboratory): CG · AR/VR
Ning Ding (Intern-S1 Team, Shanghai AI Laboratory)
Nanqin Dong (Intern-S1 Team, Shanghai AI Laboratory)
Peijie Dong (Ph.D. Candidate, HKUST(GZ)): LLM Compression · Efficient ML · LLM Pruning · LLM Quantization
Shihan Dou (Fudan University): LLMs · Code LMs · RL · Alignment
Sinan Du (Intern-S1 Team, Shanghai AI Laboratory)
Haodong Duan (Shanghai AI Lab | CUHK | PKU): Computer Vision · Video Understanding · Multimodal Learning · Generative AI