Olmo 3

📅 2025-12-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing open-source large language models (LLMs) fall short on long-context reasoning, structured function calling, and multi-task generalization. Method: Olmo 3 is a fully open, end-to-end reproducible family of 7B and 32B LLMs built with high-quality multi-task pretraining, Reinforcement Reasoning Alignment (RRA), explicit function-call modeling, and long-context optimization. Crucially, the release includes all training data, checkpoints, sampled sequences, and software dependencies. Contribution/Results: The flagship Olmo 3 Think 32B is the strongest fully open reasoning-first model to date. It reports state-of-the-art results among fully open models on reasoning (GSM8K, MMLU), coding (HumanEval, MBPP), and instruction following (AlpacaEval 2.0), and significantly outperforms open-weight baselines such as Llama 3 70B. By enabling full transparency and reproducibility, Olmo 3 advances auditable, trustworthy, and scientifically rigorous LLM research.
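Since the summary emphasizes that all checkpoints are released, the models should load with standard open tooling. A minimal sketch, assuming a Hugging Face repo id of the form allenai/Olmo-3-7B (the exact repo ids are not given on this page and are an assumption):

```python
# Hedged sketch, not from the paper: loading an Olmo 3 checkpoint with
# Hugging Face transformers. The repo id below is hypothetical; check the
# Ai2 organization on the Hub for the actual names.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allenai/Olmo-3-7B"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # requires `accelerate`; spreads weights across GPUs
)

# Plain autoregressive generation from a short prompt.
inputs = tokenizer("Language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```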

📝 Abstract
We introduce Olmo 3, a family of state-of-the-art, fully-open language models at the 7B and 32B parameter scales. Olmo 3 model construction targets long-context reasoning, function calling, coding, instruction following, general chat, and knowledge recall. This release includes the entire model flow, i.e., the full lifecycle of the family of models, including every stage, checkpoint, data point, and dependency used to build it. Our flagship model, Olmo 3 Think 32B, is the strongest fully-open thinking model released to date.
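Function calling is one of the capabilities the abstract targets. A minimal sketch of how a tool schema could be rendered through the transformers chat-template API; whether the Olmo 3 instruct templates accept a `tools` argument is an assumption, and the repo id and `get_weather` tool below are illustrative only:

```python
# Hedged sketch of tool-use prompting via transformers chat templates.
from transformers import AutoTokenizer

MODEL_ID = "allenai/Olmo-3-7B-Instruct"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# A tool schema in the JSON-schema style that transformers templates accept.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Seattle?"}]

# Render (without tokenizing) the prompt the model would see,
# with the tool schema inlined by the chat template.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)
```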
Problem

Research questions and friction points this paper is trying to address.

How can fully open language models achieve strong long-context reasoning?
How can open models support function calling, coding, and instruction following?
How can the full lifecycle of model construction be made transparent and reproducible?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fully-open 7B and 32B parameter language models
Targets long-context reasoning and function calling
Includes entire model lifecycle and all dependencies
👥 Authors

Allyson Ettinger
University of Chicago

Amanda Bertsch
PhD student, Language Technologies Institute, Carnegie Mellon University
summarization, long-context NLU, conditional generation, NLP

Bailey Kuehl
Allen Institute for AI

David Graham
Allen Institute for AI

David Heineman
Allen Institute for AI

Dirk Groeneveld
Allen Institute for Artificial Intelligence
natural language processing, neural networks, deep learning

Faeze Brahman
Research Scientist, Allen Institute for AI (Ai2)
Natural Language Processing, Machine Learning, AI Alignment, Human-Centered AI

Finbarr Timbers
Allen Institute for AI

Hamish Ivison
University of Washington
Natural Language Processing

Jacob Morrison
Allen Institute for AI
natural language processing

Jake Poznanski
Allen Institute for AI

Kyle Lo
Allen Institute for AI
natural language processing, machine learning, human computer interaction, statistics

Luca Soldaini
Allen Institute for AI
Large Language Models, Open Source AI, Information Retrieval

Matt Jordan
Graduate Research Assistant, UT Austin
Adversarial Examples

Mayee Chen
Stanford University
Machine Learning, Computer Science

Michael Noukhovitch
Mila, Université de Montréal
deep learning, multiagent reinforcement learning, natural language processing

Nathan Lambert
Research Scientist, Allen AI
Reinforcement Learning, Machine Learning, Robotics, Responsible AI

Pete Walsh
Allen Institute for AI

Pradeep Dasigi
Allen Institute for AI (Ai2)
Natural Language Processing, Machine Learning, Language Modeling

Robert Berry
Allen Institute for AI

Saumya Malik
Allen Institute for AI

Saurabh Shah
Allen Institute for AI

Scott Geng
Allen Institute for AI