Machine Learning Engineer, Offline Infrastructure (Entry-Level / New Grad)

About the job

Unity Vector builds an offline ML platform that powers insight, experimentation, attribution, and AI-driven decision-making across the company. Our systems operate at scale across batch and streaming data, supporting analytics, product intelligence, machine learning pipelines, and business operations. As data volume and complexity grow, our platform enables large-scale model training, feature generation, and experimentation workflows that power production ML systems.

We’re looking for a Machine Learning Engineer to join our Offline Infrastructure team. This is an ideal role for a recent university graduate who is excited to work on large-scale systems and apply research-driven thinking to real-world machine learning problems.

You’ll help build and evolve the infrastructure that powers training data generation, ML workflows, and distributed model training. Working closely with experienced engineers and researchers, you’ll contribute to systems that ensure our ML pipelines are reliable, scalable, and efficient. This role offers the opportunity to bridge research and production by translating advanced ideas into systems that operate at scale.

Responsibilities

Build and maintain data pipelines that generate training datasets for machine learning models and experimentation

Contribute to infrastructure that supports distributed training workflows (e.g., PyTorch, Ray)

Work with workflow orchestration tools (e.g., Airflow, Flyte, or similar) to support multi-stage ML pipelines

Improve reproducibility and reliability through dataset validation, monitoring, and testing

Partner with ML engineers to support experimentation and model iteration

Help optimize performance and efficiency across data processing and training systems

Contribute to the evolution of our offline ML platform architecture as it scales

Qualifications

Minimum

Bachelor's degree in Computer Science, Machine Learning, Systems, or a related field

Strong foundation in machine learning systems, distributed systems, or large-scale data processing (through research or projects)

Experience with Python and with data-intensive workloads

Familiarity with ML frameworks (e.g., PyTorch, TensorFlow) and/or distributed systems (e.g., Ray, Spark)

Experience (academic or applied) with data pipelines, model training workflows, or large datasets

Strong problem-solving skills and the ability to translate research ideas into practical systems

Interest in building scalable, reliable infrastructure for machine learning

Preferred

Experience with workflow orchestration systems (Airflow, Flyte, etc.)

Exposure to large-scale data platforms (data lakes, warehouses, streaming systems)

Publications or research in ML systems, distributed systems, or related areas