About the job
As a Senior Data Scientist, you will play a critical role in a key area within our Foundation AI team: Engineering Efficiency and Code Intelligence. You will build the metrics, analytics, experimentation foundation, and AI workflows that power how Roblox engineers and creators build and ship with AI and intelligent code systems.
Responsibilities
Develop Evaluation Frameworks: Design and operationalize rigorous evaluation systems for either GenAI features (text, image, video, 3D, 4D) or internal AI Agents (Code Review, Refactor, Test Gen). This includes evaluation experiment design, dataset design, label reliability analysis, and implementing and fine-tuning LLM-as-judge methods.
Run Rigorous Experiments: Conduct online experiments (A/B tests) and apply causal inference methods to quantify the impact of GenAI features or AI-assisted coding tools. You will identify opportunities, measure lift, and ensure statistical rigor.
Define Success Metrics: Partner with cross-functional teams to define leading/lagging indicators—whether for GenAI safety and user satisfaction, or for engineering productivity and code health.
Build Automated Systems: Research and apply state-of-the-art methodologies to build reproducible evaluation tooling and agentic workflows that lift rigor and efficiency across the company.
Drive Strategy & Visibility: Develop dashboards and reporting frameworks that reveal trends (e.g., model performance or developer friction) and translate complex data into clear, prioritized recommendations for leadership.
Qualifications
Minimum
Advanced Degree: PhD or Master’s in Statistics, Economics, Computer Science, Applied Math, Physics, Engineering, or a related quantitative field.
Experience: 5+ years in data science, analytics, or a related quantitative role.
Technical Proficiency: Strong proficiency in SQL (Hive/Spark) for manipulating large datasets and in a scripting language (Python or R) for analysis and modeling.
Experimentation and Causal Inference: A solid grounding in experimentation, causal inference, and statistical analysis, including test design and metric design for feature impact.
Problem Solving: A demonstrated track record of framing ambiguous problems, designing analytical approaches, and solving open-ended data science problems that drive business impact.
Learning Agility: Ability to effectively and responsibly use AI tools to enhance productivity and a passion for continuously improving methods in a fast-evolving field.
Preferred
GenAI Familiarity: Familiarity with GenAI models and safety/quality evaluation methods. Expertise in the model training lifecycle is a plus (e.g., fine-tuning, RLHF, or synthetic data generation).
Engineering Development Workflow: Experience with engineering development workflows and engineering efficiency data is a plus for the Engineering Efficiency and Code Intelligence role.
Applied Research Background: A track record of applied research or publications in relevant technical fields is highly valued.