[2026] Data Scientist, Foundation AI - PhD Early Career

Roblox
San Mateo, CA, USA
2026-02-10

About the job

As a Data Scientist, you will play a critical role in evaluation and optimization for user-facing GenAI systems (such as text, image, video, 3D, 4D). You will define how we measure safety, responsibility, quality, and efficiency. You will combine annotation analysis, design of experiments, causal inference, model-based evaluation methods (such as LLM-as-a-judge), optimization algorithms, and AI models to drive product decisions and model improvements.

Responsibilities

Develop Evaluation Frameworks: Design and operationalize rigorous evaluation systems for GenAI features (text, image, video, 3D, 4D). This includes eval experiment design, dataset design, label reliability analysis, and implementing and fine-tuning LLM-as-a-judge methods.

Run Rigorous Experiments: Conduct online experiments (A/B tests) and causal inference to quantify the impact of GenAI features. You will identify opportunities, measure lift, and ensure statistical rigor.

Define Success Metrics: Partner with cross-functional teams to define leading and lagging indicators of user satisfaction, business success, and safety for GenAI features.

Build Automated Systems: Research and apply state-of-the-art methodologies to build reproducible evaluation tooling that lifts rigor and efficiency across the company.

Conduct Applied Research at the Frontier: Stay current with work at the intersection of GenAI and Data Science. You will innovate on methodology and techniques to solve unique business challenges while contributing to the broader technical community.

Qualifications

Minimum

Possess or be pursuing a PhD or equivalent in Statistics, Economics, Computer Science, Applied Math, Physics, Engineering, or a related quantitative field.

Technical Proficiency: Strong proficiency in SQL (Hive/Spark) for manipulating large datasets and in a scripting language (Python or R) for analysis and modeling.

Experimentation and Causal Inference: A solid grounding in experimentation, causal inference, and statistical analysis, including test design and metric design for feature impact.

Problem Solving: A demonstrated track record of framing ambiguous problems, designing analytical approaches, and solving open-ended data science problems that drive business impact.

Learning Agility: Ability to effectively and responsibly use AI tools to enhance productivity and a passion for continuously improving methods in a fast-evolving field.

Preferred

GenAI Familiarity: Familiarity with GenAI models and safety/quality evaluation methods. Expertise in the model training lifecycle is a plus (e.g., fine-tuning, RLHF, or synthetic data generation).

Applied Research Background: A track record of applied research or publications in relevant technical fields is highly valued.