OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery

📅 2025-03-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses scientific reasoning and discovery tasks by introducing OmniScience, a large language model specialized for general scientific domains. Methodologically, it adopts a three-stage paradigm: (1) domain-adaptive pretraining on scientific literature, (2) scientific instruction fine-tuning, and (3) chain-of-thought–guided, reasoning-oriented knowledge distillation. Notably, it is the first to deeply couple reasoning distillation with domain-adaptive pretraining, substantially enhancing logical consistency and domain-context modeling. OmniScience achieves state-of-the-art performance on GPQA Diamond and a battery materials science benchmark, outperforming all publicly available models of comparable parameter count. Ablation studies confirm that the two core modules contribute over 42% of the overall performance gain. Furthermore, end-to-end closed-loop validation on real-world scientific tasks (e.g., ranking electrolyte solvent molecules) confirms practical efficacy in scientific discovery workflows.
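The reasoning-oriented knowledge distillation in stage (3) typically trains the student to match a teacher's softened next-token distribution over chain-of-thought traces. The paper does not publish its loss, so the following is only a minimal sketch of the standard temperature-softened KL distillation objective, in pure Python for clarity:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution, optionally softened."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over next-token distributions.

    A temperature > 1 softens both distributions, so the student learns
    from the teacher's full output distribution (including low-probability
    reasoning tokens), not just the argmax token.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student already matches the teacher and grows as the two distributions diverge; in practice this term is summed over every token position of the teacher-generated reasoning trace.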

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable potential in advancing scientific knowledge and addressing complex challenges. In this work, we introduce OmniScience, a specialized large reasoning model for general science, developed through three key components: (1) domain adaptive pretraining on a carefully curated corpus of scientific literature, (2) instruction tuning on a specialized dataset to guide the model in following domain-specific tasks, and (3) reasoning-based knowledge distillation through fine-tuning to significantly enhance its ability to generate contextually relevant and logically sound responses. We demonstrate the versatility of OmniScience by developing a battery agent that efficiently ranks molecules as potential electrolyte solvents or additives. Comprehensive evaluations reveal that OmniScience is competitive with state-of-the-art large reasoning models on the GPQA Diamond and domain-specific battery benchmarks, while outperforming all public reasoning and non-reasoning models with similar parameter counts. We further demonstrate via ablation experiments that domain adaptive pretraining and reasoning-based knowledge distillation are critical to attain our performance levels, across benchmarks.
Problem

Research questions and friction points this paper is trying to address.

How to build an LLM specialized for scientific reasoning and discovery
How to improve a model's ability to generate contextually relevant, logically sound scientific responses
How to outperform public models of comparable size on domain-specific benchmarks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain adaptive pretraining on scientific literature
Instruction tuning for domain-specific tasks
Reasoning-based knowledge distillation for logical responses
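The battery-agent use case described above amounts to scoring candidate molecules with the model and sorting by score. The paper does not specify the agent's interface, so this is a hypothetical sketch: `toy_score` is a placeholder stand-in for an OmniScience-backed scorer, which in a real agent would prompt the model with a description of each molecule.

```python
def rank_candidates(molecules, score_fn):
    """Rank candidate electrolyte solvent molecules by a model-assigned
    suitability score (higher = more promising)."""
    scored = [(score_fn(m), m) for m in molecules]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored]

def toy_score(smiles):
    # Placeholder heuristic for illustration only; not a real chemical
    # property. A real scorer would query the LLM.
    return len(smiles) % 7

# Example SMILES strings for common electrolyte solvents
candidates = ["CCOC(=O)OC", "O=C1OCCO1", "CC#N"]
ranking = rank_candidates(candidates, toy_score)
```

The design keeps the scorer pluggable: swapping `toy_score` for an LLM-backed function changes the ranking criterion without touching the ranking logic.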