Machine Learning Research Scientist, Post-Training

About the job

Scale works with the industry’s leading AI labs to provide high-quality data and accelerate progress in GenAI research. We are looking for Research Scientists and Research Engineers with expertise in LLM post-training (SFT, RLHF, reward modeling). This role will focus on optimizing data curation and evaluation to enhance LLM capabilities across text and multimodal settings.

Responsibilities

Research and develop novel post-training techniques, including SFT, RLHF, and reward modeling, to enhance core LLM capabilities across text and multimodal settings.

Design and experiment with new approaches to preference optimization.

Analyze model behavior, identify weaknesses, and propose solutions for bias mitigation and model robustness.

Publish research findings in top-tier AI conferences.

Qualifications

Minimum

No minimum qualifications listed.

Preferred

Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or a related field.

Deep understanding of deep learning, reinforcement learning, and large-scale model fine-tuning.

Experience with post-training techniques such as RLHF, preference modeling, or instruction tuning.

Excellent written and verbal communication skills.

Published machine learning research at major conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, etc.) and/or in journals.

Previous experience in a customer-facing role.