AI Research Scientist - MSL FAIR Foundations

About the job

Meta is seeking Research Scientists to join the Evaluations team within Meta Superintelligence Labs (MSL). Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. As a Research Scientist, you will provide the technical expertise to measure and understand the capabilities of our frontier AI systems. You'll work in tandem with world-class researchers to envision, develop, and validate novel evaluations that shape the future of AI capability measurement.

Responsibilities

Design novel benchmarks and evaluation methodologies for frontier AI capabilities

Contribute to evaluation frameworks that guide research direction and capability development across MSL

Support the scientific vision for evaluation approaches in emerging modalities and novel model capabilities

Partner with cross-functional research teams across product and model training to identify and prioritize gaps in capability through rigorous evaluation

Pursue research workstreams that shape the long-term direction of evaluation science at MSL, operating independently while also contributing to team goals and organizational priorities

Qualifications

Minimum

Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience

Ph.D. in Computer Science, Machine Learning, or a related technical field

3+ years of experience in machine learning research, with a focus on evaluation, deep learning, or related areas

Demonstrated ability to execute on technical research projects from conception to production

Effective communication skills and experience collaborating with technical leadership

Preferred

Multiple first-author publications at top-tier peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to language model evaluation, benchmarking, or deep learning

Recognized expertise in machine learning evaluation, benchmarking, or capability measurement

Track record of research that has substantially influenced the field of deep learning

Hands-on experience with language model post-training, RLHF, or related techniques