About the job
Nuro takes a machine-learning-first approach to autonomous driving technology. In an ML-first system, the overall system performance depends heavily on the quantity and diversity of its training and evaluation data. The team plays a crucial role in the advancement of autonomous driving systems by creating a scalable and reliable data infrastructure. This infrastructure is designed to produce training and evaluation data derived from both on-road collected logs and simulation logs. Additionally, the team collaborates closely with system engineers to thoroughly validate the autonomous driving system before its deployment.
Responsibilities
Design and develop unified, introspectable, large-scale batch and streaming data pipelines that can ingest and process data across a wide range of use cases relevant to evaluation.
Create and implement a storage system capable of accommodating both the large volume and diverse range of evaluation and performance metrics.
Construct intuitive dashboards and reports to present evaluation results, facilitating straightforward comparisons that highlight both improvements and regressions in the ML components and the overall system.
Develop and maintain continuous testing and monitoring systems to guarantee the integrity and resilience of our data and associated data pipelines.
Develop data mining tools with applied ML techniques to support data discovery needs from Autonomy, including Perception, Behavior, and Mapping.
Develop data annotation tools that help first-party and third-party labeling workforces produce high-fidelity perception, mapping, and driving trajectory labels.
Scale the production of data annotation labels with applied state-of-the-art ML techniques.
Qualifications
Minimum
A BS, MSc, or PhD degree, plus 1+ years of relevant work experience
Strong proficiency in Python or similar languages
Domain experience: Experience working with large-scale data and building scalable, reliable systems and data pipelines; ability to understand and design complex systems
Technical excellence: Ability and willingness to dive deep into implementation and to drive technical standards and best practices across the broader software organization
A bachelor's degree in Computer Science, Electrical Engineering, or a closely related field
Preferred
Strong proficiency in C++ or other high-performance low-level languages
Strong knowledge of GCP, GCS, BigQuery, or PostgreSQL
Knowledge of data engineering, including its tooling and best practices
Knowledge of batch and streaming data processing, warehousing, and analytics solutions
Experience working with large-scale distributed data systems
Experience with system & framework design
Experience with data workflow orchestration platforms