About the job
You will be a vital member of our ML Data Team, which leads the full spectrum of video-language data preparation and model evaluation. This role comes with high ownership and includes responsibilities such as defining dataset needs and requirements in consultation with our research and product teams, designing and building data pipelines, and driving our post-training model evaluation strategy. You will also be responsible for automating as much of the repetitive partnership, annotation, and quality evaluation work as possible. A desire to work cross-functionally and to build relationships is critical for success in this position.
Responsibilities
- Model Evaluation: Design and build robust model evaluation frameworks, automating repetitive processes and balancing efficiency with depth when gathering evaluation metrics and feedback.
- Portfolio Monitoring: Manage resource allocation and timelines, adjusting direction flexibly based on real-time information across all data streams in your product vertical.
- External Partner Collaboration: Enhance dataset and process quality through seamless collaboration with vendors and outsourcing partners.
- Data Quality & Tooling Advancement: Establish labeling guidelines, monitor data quality, and improve tools and infrastructure to build a sustainable data operations framework.
- Internal Collaboration: Partner with Engineering and AI Model teams to align on top-priority data needs, design tools such as analytical reports and dashboards, and clearly communicate project progress.
Qualifications
Minimum
- 5+ years of experience working in an AI-focused data operations organization.
- A proven track record of designing and executing large-scale data or evaluation projects, including gathering, labeling, and post-processing data.
- The ability to analyze messy and complex data, identify overarching patterns, and distill your findings into crisp annotation guidelines or model quality reports.
- Proficiency with Python, LLMs, or other popular industry tools for automation.
- Excellent communication and project management skills, and the ability to support several projects simultaneously.
- A foundational understanding of and interest in LLMs/VLMs and multimodal AI.
- Conviction that data is the key ingredient for the performance and assessment of AI models.
Preferred
- Experience in data collection and labeling for multimodal language models.
- Experience in red teaming, localization testing, or other evaluation-focused fields.
- Experience working with research scientists and engineers.
- Expertise or interest in video-centric domains, such as sports, advertising, and content creation.