Machine Learning Research Engineer, GenAI Applied ML

Scale AI
San Francisco / New York / Seattle
2024-11-13

About the job

Lead applied ML engineering on Scale's Applied ML team, which powers the data infrastructure behind leading agentic LLMs (ChatGPT, Gemini, Llama). You will build scalable multi-agent systems that validate agentic reasoning and behaviors and scale human expertise. You will also lead research into why agents that score well on benchmarks still fail in real-world use, and ship production fixes for those failures.

Responsibilities

Build and deploy multi-agent systems for agentic reasoning validation

Develop pipelines to detect errors and scale human judgment

Combine classical ML, LLMs, and multi-agent techniques for reliability

Lead research into agent failure modes and ship fixes

Use AI tools to speed prototyping and iteration

Build data-driven evaluations and deploy rapid improvements

Integrate systems into Scale's platform

Qualifications

Minimum

PhD or MSc in Computer Science, Mathematics, Statistics, or a related field

3+ years shipping scaled production ML systems

Demonstrated real-world impact

Mastery of PyTorch, TensorFlow, JAX, or scikit-learn

Deep expertise in agentic LLMs and multi-agent systems

Strong software engineering skills, including microservices on AWS or GCP

Ability to iterate rapidly, guided by data

Proficiency using AI tools to accelerate work

Strong research depth with practical bias

Excellent cross-functional communication

Preferred

Experience prototyping agent evaluation/reliability systems

Experience with human-in-the-loop or annotation pipelines

Open-source contributions in agents, evaluation, or alignment

Publications on agent reliability (NeurIPS, ICML, ICLR)