About the job
Meta is seeking Research Scientists to join the Evaluations team within Meta Superintelligence Labs (MSL). Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. As a Research Scientist, you will provide the technical expertise to measure and understand the capabilities of our frontier AI systems. You'll work in tandem with world-class researchers to envision, develop, and validate novel evaluations that shape the future of AI capability measurement.
Responsibilities
Design novel benchmarks and evaluation methodologies for frontier AI capabilities
Contribute to evaluation frameworks that guide research direction and capability development across MSL
Support the scientific vision for evaluation approaches in emerging modalities and novel model capabilities
Partner with cross-functional research teams across product and model training to identify and prioritize gaps in capability through rigorous evaluation
Work on research workstreams that shape the long-term direction of evaluation science at MSL, operating independently while also contributing to team goals and organizational priorities
Qualifications
Minimum
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
Ph.D. in Computer Science, Machine Learning, or a related technical field
3+ years of experience in machine learning research, with a focus on evaluation, deep learning, or related areas
Demonstrated ability to execute on technical research projects from conception to production
Effective communication skills and experience collaborating with technical leadership
Preferred
Multiple first-author publications at top-tier peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to language model evaluation, benchmarking, or deep learning
Recognized expertise in machine learning evaluation, benchmarking, or capability measurement
Track record of research that has substantially influenced the field of deep learning
Hands-on experience with language model post-training, RLHF, or related techniques