Atla Selene Mini: A General Purpose Evaluation Model

πŸ“… 2025-01-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the limited cross-domain generalization, expert agreement, and multi-task adaptability of existing small-language-model (SLM) evaluators, this paper introduces Atla Selene Mini, a general-purpose 8B-parameter judge model. Methodologically, the authors propose a data curation strategy integrating synthetic critique augmentation, multi-stage quality filtering, and prompt-robustness optimization, and train the model with a joint objective combining direct preference optimization (DPO) and supervised fine-tuning (SFT). The contributions are threefold: (1) Selene Mini achieves state-of-the-art overall performance across 11 out-of-distribution benchmarks, surpassing GPT-4o-mini and comparable SLM judges; (2) it is the highest-scoring 8B generative model on RewardBench; (3) it substantially improves zero-shot agreement with human expert ratings on finance and healthcare datasets, and preliminary results place it first on the live, community-driven Judge Arena leaderboard. The model weights are publicly released on Hugging Face and Ollama.

πŸ“ Abstract
We introduce Atla Selene Mini, a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini is a general-purpose evaluator that outperforms the best SLMJs and GPT-4o-mini on overall performance across 11 out-of-distribution benchmarks, spanning absolute scoring, classification, and pairwise preference tasks. It is the highest-scoring 8B generative model on RewardBench, surpassing strong baselines like GPT-4o and specialized judges. To achieve this, we develop a principled data curation strategy that augments public datasets with synthetically generated critiques and ensures high quality through filtering and dataset ablations. We train our model on a combined direct preference optimization (DPO) and supervised fine-tuning (SFT) loss, and produce a highly promptable evaluator that excels in real-world scenarios. Selene Mini shows dramatically improved zero-shot agreement with human expert evaluations on financial and medical industry datasets. It is also robust to variations in prompt format. Preliminary results indicate that Selene Mini is the top-ranking evaluator in a live, community-driven Judge Arena. We release the model weights on HuggingFace (https://hf.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) and Ollama to encourage widespread community adoption.
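The abstract notes that the model is trained on a combined DPO and SFT loss. A minimal scalar sketch of such a joint objective is given below; the `beta` and `sft_weight` values are illustrative assumptions, not the paper's hyperparameters, and real training would operate on per-sequence log-probabilities from the policy and a frozen reference model.

```python
import math

def dpo_plus_sft_loss(policy_chosen_logp, policy_rejected_logp,
                      ref_chosen_logp, ref_rejected_logp,
                      beta=0.1, sft_weight=1.0):
    """Sketch of a combined DPO + SFT objective for one preference pair.

    Each *_logp argument is the summed log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    """
    # DPO term: implicit reward margin between chosen and rejected
    # responses, measured relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    dpo_loss = -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid

    # SFT term: negative log-likelihood of the chosen (preferred) response.
    sft_loss = -policy_chosen_logp

    return dpo_loss + sft_weight * sft_loss
```

The DPO term pushes the policy to widen the gap between preferred and rejected judgments relative to the reference model, while the SFT term keeps the policy anchored to the preferred outputs themselves.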
Problem

Research questions and friction points this paper is trying to address.

Efficient Language Model
Expert-Level Consistency
Domain-Specific Evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Atla Selene Mini
RewardBench Competition
Customized Training Strategies
πŸ”Ž Similar Papers
No similar papers found.
👥 Authors
Andrei Alexandru
University College London
Antonia Calvi
University College London
Henry Broomfield
University College London
Jackson Golden
University College London
Kyle Dai
University College London
Mathias Leys
University College London
Maurice Burger
University College London
Max Bartolo
Google DeepMind, UCL
Roman Engeler
University College London
Sashank Pisupati
University College London
Toby Drane
University College London
Young Sun Park
University College London