About the job
Join the AI Platform team at Roblox, where we are building the infrastructure that enables world-class AI/ML experiences. Our mission is to empower developers and creators by providing scalable, high-performance systems that drive AI innovation at unprecedented scale. As an AI Platform Engineer specializing in ML Infrastructure, Training Data Infrastructure, and Feature Store Infrastructure, you will design and build next-generation ML data platforms, enabling AI applications across recommendations, search, safety, content generation, fraud detection, and more. Your work will directly impact millions of users, supporting AI-driven experiences across the entire Roblox ecosystem.
Responsibilities
Develop a scalable Feature Store: Architect and implement a system that supports both batch and real-time feature computation, storage, and serving, enabling low-latency AI inference at scale. Develop Sequence Feature Stores that collect raw user events and power the training of transformer models for recommendation engines at Roblox.
Develop Roblox’s first training data platform: Architect and engineer training data collection, reusability, and reliability while solving world-class engineering problems such as point-in-time joins and feature injection. Power both batch and real-time model training.
Engineer large-scale ML data pipelines: Design, optimize, and maintain high-throughput streaming and batch ETL pipelines that ingest, process, and serve billions of records per day.
Build a feature engineering framework: Empower hundreds of machine learning engineers to author, experiment with, and productionize millions of signals quickly, easily, and reliably.
Enable online and offline inference: Ensure real-time, low-latency feature access for AI inference while maintaining consistency and robustness for batch model training.
Collaborate with cross-functional AI teams: Partner with ML researchers, data engineers, and product teams to build the next-gen AI infrastructure that fuels innovation at Roblox.
Research & innovate: Stay at the forefront of AI infrastructure by exploring the latest in VectorDBs, Feature Stores, Graph ML, Agentic Systems and Real-time AI serving.
Qualifications
Minimum
5+ years of experience in AI/ML data infrastructure, feature stores, and training data infrastructure. Experience working with industry-leading NoSQL and distributed databases such as DynamoDB, Cassandra, and CockroachDB.
Expertise in building scalable ML data pipelines for both batch and real-time environments.
Strong software engineering skills with experience in distributed computing frameworks (e.g., Spark, Flink, Ray) and real-time data systems (e.g., Kafka, Redis, DynamoDB).
Deep knowledge of ML Feature Stores (e.g., Feast, Tecton, Vertex AI Feature Store) and embedding/vector search infrastructure.
Familiarity with Knowledge Graphs and graph-based ML applications (e.g., Neo4j, TigerGraph, Amazon Neptune).
A Bachelor's degree in Computer Science, Engineering, or a related field.
Preferred
No preferred qualifications listed.