🤖 AI Summary
Current high-performance large language models (LLMs) are predominantly closed-source, limiting transparency, reproducibility, and community scrutiny.
Method: This work introduces Instella, a fully open-source and reproducible LLM family trained end-to-end using publicly available data and code, spanning pretraining, instruction fine-tuning, and human preference alignment. Trained on AMD Instinct MI300X GPUs, it combines supervised fine-tuning and reinforcement learning to strengthen instruction following and complex reasoning. Crucially, this 3B-parameter model achieves state-of-the-art performance among fully open models despite using significantly fewer pretraining tokens.
Contribution/Results: Instella sets a new benchmark among fully open LLMs, delivering superior performance across standard evaluations while remaining competitive with leading open-weight models of comparable size. It includes specialized variants supporting a 128K-token context window and optimized mathematical reasoning. By providing complete openness, from data and training scripts to checkpoints, the project establishes a transparent, reproducible paradigm for LLM research and development.
📝 Abstract
Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks, yet the majority of high-performing models remain closed-source or partially open, limiting transparency and reproducibility. In this work, we introduce Instella, a family of fully open three-billion-parameter language models trained entirely with openly available data and code. Powered by AMD Instinct MI300X GPUs, Instella is developed through large-scale pre-training, general-purpose instruction tuning, and alignment with human preferences. Despite using substantially fewer pre-training tokens than many contemporaries, Instella achieves state-of-the-art results among fully open models and is competitive with leading open-weight models of comparable size. We further release two specialized variants: Instella-Long, capable of handling context lengths up to 128K tokens, and Instella-Math, a reasoning-focused model enhanced through supervised fine-tuning and reinforcement learning on mathematical tasks. Together, these contributions establish Instella as a transparent, performant, and versatile alternative for the community, advancing the goal of open and reproducible language modeling research.