About the job
We are seeking an Applied Scientist to lead the development of evaluation frameworks and data collection protocols for robotic capabilities. In this role, you will focus on designing how we measure, stress-test, and improve robot behavior across a wide range of real-world tasks. Your work will play a critical role in shaping how policies are validated and how high-quality datasets are generated to accelerate system performance. You will operate at the intersection of robotics, machine learning, and human-in-the-loop systems, building the infrastructure and methodologies that connect teleoperation, evaluation, and learning. This includes developing evaluation policies, defining task structures, and contributing to operator-facing interfaces that enable scalable and reliable data collection. The ideal candidate is highly experimental, systems-oriented, and comfortable working across software, robotics, and data pipelines, with a strong focus on turning ambiguous capability goals into measurable and actionable evaluation systems.
Responsibilities
Design and implement evaluation frameworks to measure robot capabilities across structured tasks, edge cases, and real-world scenarios
Develop task definitions, success criteria, and benchmarking methodologies that enable consistent and reproducible evaluation of policies
Create and refine data collection protocols that generate high-quality, task-relevant datasets aligned with model development needs
Build and iterate on teleoperation workflows and operator interfaces to support efficient, reliable, and scalable data collection
Analyze evaluation results and collected data to identify performance gaps, failure modes, and opportunities for targeted data collection
Collaborate with engineering teams to integrate evaluation tooling, logging systems, and data pipelines into the broader robotics stack
Stay current with advances in robotics, evaluation methodologies, and human-in-the-loop learning to continuously improve internal approaches
Lead technical projects from conception through production deployment
Mentor junior scientists and engineers
Qualifications
Minimum
PhD, or Master's degree and 6+ years of applied research experience
3+ years of industry or academic research experience
Experience with at least one programming language such as Python, Java, or C++
5+ years of experience building machine learning models or developing algorithms for business applications
Experience leading technical initiatives and key deliverables
Patents or publications at top-tier conferences
Demonstrated expertise in deep learning and model development
Strong experience with robotics systems, control, or embodied AI
Experience designing evaluation methodologies, benchmarks, or experimental frameworks for large-scale ML models or robotic systems
Familiarity with teleoperation systems, simulation environments, or human-in-the-loop data collection
Preferred
Experience managing and deploying ML products
Patents or publications at top-tier peer-reviewed conferences or journals
Experience leading research initiatives in robotics or foundation models