Laguna M.1/XS.2 Technical Report

📅 2026-05-26
🤖 AI Summary
This work targets performance and efficiency bottlenecks in long-horizon, agentic code generation for complex software engineering tasks. It describes a Model Factory, a tightly integrated stack of versioned data, training, evaluation, and inference components that turns large language model development into an industrial process. Within this system, the authors trained two sparsely activated Mixture-of-Experts foundation models from scratch, Laguna M.1 and Laguna XS.2; the report covers pre-training data and architecture, post-training stages, evaluation, and quantization. On SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro, and Terminal-Bench 2.0, both models are competitive with state-of-the-art open models in their respective weight classes. The Laguna XS.2 weights are released under the Apache 2.0 license.
📝 Abstract
We present Laguna M.1 and Laguna XS.2, two Mixture-of-Experts foundation models built for long-horizon, agentic coding: M.1 has 225.8B total parameters (23.4B activated per token) and XS.2 has 33.4B total (3B activated). Both models were trained from scratch, end to end, inside the same internal system, which we refer to as our Model Factory: a tightly integrated stack of versioned data, training, evaluation, and inference components that turns model development into an industrial process. We describe the principles and design choices of the Model Factory and detail the end-to-end training process of our models, covering pre-training data and architecture, post-training stages, evaluation, and quantization. On agentic software engineering and terminal benchmarks (SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro, and Terminal-Bench 2.0), M.1 and XS.2 are competitive with state-of-the-art open models in their respective weight classes. Laguna XS.2 weights are released under Apache 2.0 at https://huggingface.co/collections/poolside/laguna-xs2.
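To make the sparsity figures in the abstract concrete, the short sketch below computes the share of parameters activated per token for each model. The parameter counts come from the abstract; the `active_fraction` helper and the `MODELS` table are illustrative names, not part of the report, and the XS.2 result is only approximate because its activated count is quoted as a rounded 3B.

```python
# Parameter counts in billions, as quoted in the abstract.
# The dictionary layout and helper name are illustrative assumptions.
MODELS = {
    "Laguna M.1": {"total_b": 225.8, "active_b": 23.4},
    "Laguna XS.2": {"total_b": 33.4, "active_b": 3.0},  # "3B" is a rounded figure
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters activated per token."""
    return active_b / total_b


for name, params in MODELS.items():
    frac = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {frac:.1%} of parameters active per token")
```

Under these figures, both models activate roughly a tenth of their parameters per token (about 10.4% for M.1 and about 9.0% for XS.2).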
Problem

Research questions and friction points this paper is trying to address.

agentic coding
long-horizon tasks
foundation models
software engineering automation
Mixture-of-Experts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts
agentic coding
Model Factory
long-horizon reasoning
foundation model