Atla Selene Mini: A General Purpose Evaluation Model

πŸ“… 2025-01-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the limited cross-domain generalization, expert agreement, and multi-task adaptability of existing small-language-model (SLM) evaluators, this paper introduces Atla Selene Mini, a general-purpose 8B-parameter judge model. Methodologically, the authors propose a data curation strategy integrating synthetic critique augmentation, multi-stage quality filtering, and prompt-robustness optimization, and train the model with a joint objective combining direct preference optimization (DPO) and supervised fine-tuning (SFT). The contributions are threefold: (1) Selene Mini achieves state-of-the-art overall performance across 11 out-of-distribution benchmarks, surpassing GPT-4o-mini and comparable SLM judges; (2) it is the highest-scoring 8B generative model on RewardBench; (3) it substantially improves zero-shot agreement with human expert ratings on finance and healthcare datasets, and preliminary results place it first on the live, community-driven Judge Arena leaderboard. The model weights are publicly released on Hugging Face and Ollama.

πŸ“ Abstract
We introduce Atla Selene Mini, a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini is a general-purpose evaluator that outperforms the best SLMJs and GPT-4o-mini on overall performance across 11 out-of-distribution benchmarks, spanning absolute scoring, classification, and pairwise preference tasks. It is the highest-scoring 8B generative model on RewardBench, surpassing strong baselines like GPT-4o and specialized judges. To achieve this, we develop a principled data curation strategy that augments public datasets with synthetically generated critiques and ensures high quality through filtering and dataset ablations. We train our model on a combined direct preference optimization (DPO) and supervised fine-tuning (SFT) loss, and produce a highly promptable evaluator that excels in real-world scenarios. Selene Mini shows dramatically improved zero-shot agreement with human expert evaluations on financial and medical industry datasets. It is also robust to variations in prompt format. Preliminary results indicate that Selene Mini is the top-ranking evaluator in a live, community-driven Judge Arena. We release the model weights on HuggingFace (https://hf.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) and Ollama to encourage widespread community adoption.
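The abstract notes that the model is trained on a combined DPO and SFT loss. A minimal scalar sketch of such a joint objective is given below; the `beta` and `sft_weight` values are illustrative assumptions, not the paper's hyperparameters, and real training would operate on per-sequence log-probabilities from the policy and a frozen reference model.

```python
import math

def dpo_plus_sft_loss(policy_chosen_logp, policy_rejected_logp,
                      ref_chosen_logp, ref_rejected_logp,
                      beta=0.1, sft_weight=1.0):
    """Sketch of a combined DPO + SFT objective for one preference pair.

    Each *_logp argument is the summed log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    """
    # DPO term: implicit reward margin between chosen and rejected
    # responses, measured relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    dpo_loss = -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid

    # SFT term: negative log-likelihood of the chosen (preferred) response.
    sft_loss = -policy_chosen_logp

    return dpo_loss + sft_weight * sft_loss
```

The DPO term pushes the policy to widen the gap between preferred and rejected judgments relative to the reference model, while the SFT term keeps the policy anchored to the preferred outputs themselves.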
Problem

Research questions and friction points this paper is trying to address.

Efficient Language Model
Expert-Level Consistency
Domain-Specific Evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Atla Selene Mini
RewardBench Competition
Customized Training Strategies
πŸ”Ž Similar Papers
No similar papers found.
👥 Authors
Andrei Alexandru
University College London
Antonia Calvi
University College London
Henry Broomfield
University College London
Jackson Golden
University College London
Kyle Dai
University College London
Mathias Leys
University College London
Maurice Burger
University College London
Max Bartolo
Google DeepMind, UCL
Roman Engeler
University College London
Sashank Pisupati
University College London
Toby Drane
University College London
Young Sun Park
University College London